I am trying to web scrape the data from the Flipkart site. The link for the webpage is as follows:
https://www.flipkart.com/mi-a1-black-64-gb/product-reviews/itmexnsrtzhbbneg?aid=overall&pid=MOBEX9WXUSZVYHET
I need to automate navigation to the next page by clicking the NEXT button on the webpage. Below is the code I'm using:
nextButton <- remDr$findElement(value = '//div[@class="_2kUstJ"]')$clickElement()
Error
Selenium message:Element is not clickable at point
I also tried scrolling the page, as suggested in many Stack Overflow answers, using the code below:
remDr$executeScript("arguments[0].scrollIntoView(true);", nextButton)
But this code also gives an error:
Error in checkError(res) : Undefined error in httr call. httr output: No method for S4 class:webElement
Kindly suggest a solution. I'm using the Firefox browser and Selenium, automated from R.
If you do not mind using the Chrome driver, the following code worked:
library(RSelenium)
# Chrome options: headless mode with a fixed window size
eCaps <- list(chromeOptions = list(
  args = c('--headless', '--disable-gpu', '--window-size=1880,1000', "--no-sandbox", "--disable-dev-shm-usage")
))
remDr <- rsDriver(port = 4565L, browser = "chrome", extraCapabilities = eCaps)
remCl <- remDr[["client"]]
remCl$navigate("https://www.flipkart.com/mi-a1-black-64-gb/product-reviews/itmexnsrtzhbbneg?aid=overall&pid=MOBEX9WXUSZVYHET")
# Click the Next link
remCl$findElement(using = "css selector", "._3fVaIS > span:nth-child(1)")$clickElement()
We shall first scroll to the end of the page and then click Next.
#Navigate to webpage
remDr$navigate("https://www.flipkart.com/mi-a1-black-64-gb/product-reviews/itmexnsrtzhbbneg?aid=overall&pid=MOBEX9WXUSZVYHET")
#Scroll to the end
webElem <- remDr$findElement("css", "html")
webElem$sendKeysToElement(list(key="end"))
#click on Next
remDr$findElement(using = "xpath", '//*[@id="container"]/div/div[3]/div/div/div[2]/div[13]/div/div/nav/a[11]/span')$clickElement()
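On the second error in the question: the call at the end of this page passes script arguments to executeScript wrapped in list(...), so handing it the webElement bare is a likely cause of the "No method for S4 class:webElement" message. Note also that chaining $clickElement() onto the assignment does not keep the element in nextButton. A minimal sketch, assuming remDr is an already-open client; scroll_and_click is a hypothetical helper name, not part of RSelenium:

```r
# Hypothetical helper: scroll an element into view, then click it.
# Assumes `remDr` is an already-open RSelenium client.
scroll_and_click <- function(remDr, xpath) {
  # Keep the element itself; do not chain $clickElement() onto the assignment
  elem <- remDr$findElement(using = "xpath", value = xpath)
  # Script arguments are passed wrapped in a list
  remDr$executeScript("arguments[0].scrollIntoView(true);", args = list(elem))
  elem$clickElement()
  invisible(elem)
}

# Usage (XPath taken from the question):
# scroll_and_click(remDr, '//div[@class="_2kUstJ"]')
```

If the element is still covered by something after scrolling, clicking through JavaScript with "arguments[0].click();" may be worth trying instead.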
Related
I am scraping Yahoo Finance, and the financial tables have "Quarterly" and "Expand All" buttons. When I inspect the buttons I get <button class= expandPF Fz(s)....................." and I don't know how to select that in CSS. My code is:
webElem <- remDr$findElement(using = "class", ".expandPf")
But it gives an error. I also tried using = "CSS", with the same error.
Please Help!! :)) Thanks in advance
I also tried:
> webElem <-remDr$findElement("xpath", "//*[@id='Col1-1-Financials-Proxy']/section/div[2]/button/div/span")$clickElement()
Selenium message:No active session with ID b3685d1143434ead399da8235261106f
Error: Summary: NoSuchDriver
Detail: A session is either terminated or not started
Further Details: run errorDetails method
but get this error.
If anyone has an idea please help :))
You can try using XPath to click Expand and Collapse:
url = 'https://finance.yahoo.com/quote/ADESE.IS/financials?p=ADESE.IS'
library(RSelenium)
driver = rsDriver(port = 4492L, browser = c("firefox"))
remDr <- driver[["client"]]
remDr$navigate(url)
remDr$findElement('xpath', '//*[@id="Col1-1-Financials-Proxy"]/section/div[2]/button')$clickElement()
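As for the CSS attempt in the question: a CSS selector takes the leading dot (".expandPf"), the class-name strategy takes the bare name without it, and the strategy string is lowercase "css selector". The NoSuchDriver message is a separate issue: it says the browser session was not open when the call was made. A minimal sketch, assuming an open client and that expandPf (the class named in the question) is still present on the page; click_by_class is a hypothetical helper:

```r
# Hypothetical helper: click the first element carrying a given class.
# Assumes `remDr` is an already-open RSelenium client and that
# `class_name` is a single, simple class name such as "expandPf".
click_by_class <- function(remDr, class_name) {
  # Normalise "expandPf" or ".expandPf" to the CSS selector ".expandPf"
  selector <- paste0(".", sub("^\\.", "", class_name))
  elem <- remDr$findElement(using = "css selector", value = selector)
  elem$clickElement()
  invisible(selector)
}

# Usage:
# click_by_class(remDr, "expandPf")
```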
I am trying to scrape a page, getting the move list of a game of chess, which is located in the menu on the right, under the "moves" tab.
library(RSelenium)
url <- "https://play.xiangqi.com/game/oX00ly"
rD <- RSelenium::rsDriver(browser = "firefox", check = F)
remDr <- rD$client
remDr$navigate(url = url)
When I manually click the Moves tab in the browser, I can get the desired text via
webElem <- remDr$findElement("css selector", ".Wrapper__MovesTabWrapper-sc-13rqht3-2")
webElem$getElementText()[[1]]
which (correctly) returns
[1] "1\np3+1\nP3+1\n2\ne3+5\nH2+3\n3\nh8+7\nH8+7\n4\nh2+3\nR1+1\n5\nc8=9\nH3+2\n6\nc2+1\nE7+5\n7\nh3+4\nA6+5\n8\nh4+3\nR9=6\n9\nr1=3\nR6+6\n10\nc2+2\nH2+3\n11\nr9=8\nC2=3\n12\nr8+3\nR1=4\n13\nc2-1\nR6=8\n14\nr8+4\nH3+1\n15\ne7+9\nC3+5\n16\ne9-7\nR4+3\n17\nc2=1\nR8=9\n18\nh3-4\nR4=6\n19\nc1=2\nR9-1\n20\nr3=2\nC8+7\n21\ne5-3\nR9=8\n22\nh4-3\nR8+2\n23\nh3-2\nR8+2\n24\ne7+5\nH7+8\n25\nr8-5\nC3+1\n26\nr8+2\nH8+7\n27\np9+1\nH7+5\n28\na6+5\nH5+7\n29\nk5=6\nR6=4\n30\na5+6\nR4+3"
Problem
When trying to click the button through RSelenium, by using
webElem <- remDr$findElement("css selector", "#moves-tab")
webElem <-webElem$clickElement() # or webElem$click()
Nothing seems to happen, and I'm at a loss on how to proceed troubleshooting.
Question
How can I switch to the Moves tab by simulating a click (active event listener)?
Bonus pts: is this possible using the rvest package?
Sometimes being too trigger happy is a problem.
Adding
webElem <- webElem$clickElement()
Sys.sleep(2)
solved the problem.
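A fixed Sys.sleep(2) works, but it is a guess at how long the tab takes to render. A rough alternative is to poll until the text shows up. This is only a sketch: wait_for_text is a hypothetical helper, the timeout and interval defaults are arbitrary, and remDr is assumed to be an open client.

```r
# Hypothetical helper: poll until an element has non-empty text.
# Assumes `remDr` is an already-open RSelenium client.
wait_for_text <- function(remDr, css, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # A missing element raises an error; treat that as "no text yet"
    txt <- tryCatch(
      remDr$findElement(using = "css selector", value = css)$getElementText()[[1]],
      error = function(e) ""
    )
    if (nzchar(txt)) return(txt)
    if (Sys.time() >= deadline) stop("Timed out waiting for text in: ", css)
    Sys.sleep(interval)
  }
}

# Usage (selectors taken from the question):
# remDr$findElement("css selector", "#moves-tab")$clickElement()
# moves <- wait_for_text(remDr, ".Wrapper__MovesTabWrapper-sc-13rqht3-2")
```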
I have been trying to get the element "Excel CSV" on a web page using remDrv$findElements in R, but have not been able to. How can I locate the element using the xpath, css, etc. arguments?
I tried:
library(RSelenium)
test_link="https://sinca.mma.gob.cl/cgi-bin/APUB-MMA/apub.htmlindico2.cgi?page=pageFrame&header=Talagante&macropath=./RM/D28/Cal/PM25&macro=PM25.horario.horario&from=080522&to=210909&"
rD <- rsDriver(port=4446L, browser = "firefox", chromever = "92.0.4515.107") # runs a chrome browser, wait for necessary files to download
remDrv <- rD$client
#remDrv$open(silent = TRUE)
url<-test_link
remDrv$navigate(url)
remDrv$findElements(using = "xpath", "/html/body/table/tbody/tr/td/table[2]/tbody/tr[1]/td/label/span[3]/a")
link: https://sinca.mma.gob.cl/cgi-bin/APUB-MMA/apub.htmlindico2.cgi?page=pageFrame&header=Talagante&macropath=./RM/D28/Cal/PM25&macro=PM25.horario.horario&from=080522&to=210909&
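One thing that may be worth trying is locating the link by its visible text rather than by an absolute XPath. The sketch below rests on assumptions: that the link text really contains "Excel CSV", that the link is not inside a frame (a frame switch would be needed first if it is), and that remDrv is an open client. get_link_href is a hypothetical helper.

```r
# Hypothetical helper: return the href of the first link whose visible
# text contains `link_text`, or NA if no such link is found.
# Assumes `remDrv` is an already-open RSelenium client.
get_link_href <- function(remDrv, link_text) {
  links <- remDrv$findElements(using = "partial link text", value = link_text)
  if (length(links) == 0) return(NA_character_)
  links[[1]]$getElementAttribute("href")[[1]]
}

# Usage:
# get_link_href(remDrv, "Excel CSV")
```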
I'm using R (and RSelenium) to scrape data from ESPN. It's not the first time I use it, but in this case I'm getting an error and I can't sort this out.
Consider this page: http://en.espn.co.uk/premiership-2011-12/rugby/match/142562.html
Let's try to scrape the timeline. If I inspect the page I get the css selector
#liveLeft
As usual, I go with
checkForServer()
remDr <- remoteDriver()
remDr$open()
matchId <- "142562"
leagueString <- "premiership"
seasonString <- "2011-12"
url <- paste0("http://en.espn.co.uk/",leagueString,"-",seasonString,"/rugby/match/",matchId,".html")
remDr$navigate(url)
and the page correctly loads. So far so good. Now when I try to get the nodes with
div<- remDr$findElement(using = 'css selector','#liveLeft')
I get back
Error: Summary: NoSuchElement
Detail: An element could not be located on the page using the given search parameters.
I'm puzzled. I also tried XPath, which doesn't work either, and I tried to get different elements of the page, with no luck. The only selector that returns something is
#scrumContent
From the comments.
The element resides in an iframe, so it isn't available to select from the parent page. You can see this by running document.getElementById('liveLeft') in the Chrome console: on the full page it returns null, i.e. the element doesn't exist in that document, even though it is clearly visible. To get around this, simply load the iframe instead.
If you inspect the page you will see that the src for the iframe is /premiership-2011-12/rugby/current/match/142562.html?view=scorecard, from the example provided. Navigating to this page instead of the 'full' page makes the element 'visible' and therefore selectable by RSelenium.
checkForServer()
remDr <- remoteDriver()
remDr$open()
matchId <- "142562"
leagueString <- "premiership"
seasonString <- "2011-12"
url <- paste0("http://en.espn.co.uk/",leagueString,"-",seasonString,"/rugby/current/match/",matchId,".html?view=scorecard")
# Amend url to return iframe
remDr$navigate(url)
div<- remDr$findElement(using = 'css selector','#liveLeft')
UPDATE
If it would be more appropriate to load the iframe contents into a variable and then traverse that, the following JavaScript example shows how.
document.getElementById('liveLeft') // Returns null because the iframe has a separate DOM
var doc = document.getElementById('win_old').contentDocument // Loads the iframe's DOM into the variable doc
doc.getElementById('liveLeft') // Now returns the desired element
Generally with Selenium when you have a webpage with frames/iframes you need to use the switchToFrame method of the remoteDriver class:
library(RSelenium)
library(XML) # provides htmlParse, xmlGetAttr and readHTMLTable
selServ <- startServer()
remDr <- remoteDriver()
remDr$open()
matchId <- "142562"
leagueString <- "premiership"
seasonString <- "2011-12"
url <- paste0("http://en.espn.co.uk/",leagueString,"-",seasonString,"/rugby/match/",matchId,".html")
remDr$navigate(url)
# check the iframes
iframes <- htmlParse(remDr$getPageSource()[[1]])["//iframe", fun = function(x){xmlGetAttr(x, "id")}]
# iframes[[3]] == "win_old" contains the data switch to this frame
remDr$switchToFrame(iframes[[3]])
# check you can access the element
div<- remDr$findElement(using = 'css selector','#liveLeft')
div$highlightElement()
# get data
ifSource <- htmlParse(remDr$getPageSource()[[1]])
out <- readHTMLTable(ifSource["//div[@id = 'liveLeft']"][[1]], header = TRUE)
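switchToFrame also accepts a webElement, and calling it with NULL returns focus to the top-level document. A small sketch of wrapping both steps so the driver never stays stuck inside the frame; with_frame is a hypothetical helper, and remDr is assumed to be an open client:

```r
# Hypothetical helper: run `action` inside a frame, then switch back.
# Assumes `remDr` is an already-open RSelenium client.
with_frame <- function(remDr, frame_css, action) {
  frame <- remDr$findElement(using = "css selector", value = frame_css)
  remDr$switchToFrame(frame)
  # Always return to the top-level document, even if `action` fails
  on.exit(remDr$switchToFrame(NULL), add = TRUE)
  action(remDr)
}

# Usage (ids taken from the answers above):
# txt <- with_frame(remDr, "#win_old", function(d) {
#   d$findElement("css selector", "#liveLeft")$getElementText()[[1]]
# })
```

Returning text rather than the element itself is deliberate: an element found inside the frame is generally not usable once focus has moved back to the parent document.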
As with any problem, before posting it on Stack Overflow I think I have tried everything. This is a learning experience for me in working with JavaScript and XML, so I'm guessing my problem lies there.
My question is how to get the results of clicking on the parcel number links, which are JavaScript links. I've tried getting the XPath of the link and using the $click method, which followed my intuition, but this wasn't right or at least is not working for me.
Firefox 26.0
R 3.0.2
require(relenium)
library(XML)
library(stringr)
initializing_parcel_number <- "00000000000"
firefox <- firefoxClass$new()
firefox$get("http://www.muni.org/pw/public.html")
inputElement <- firefox$findElementByXPath("/html/body/form[2]/table/tbody/tr[2]/td/table[1]/tbody/tr[3]/td[4]/input[1]")
inputElement$sendKeys(initializing_parcel_number)
inputElement$sendKeys(key = "ENTER")
##xpath to the first link. Or is it?
first_link <- "/html/body/table/tbody/tr[2]/td/table[5]/tbody/tr[2]/td[1]/a"
##How I'm trying to click the thing.
linkElement <- firefox$findElementByXPath("/html/body/table/tbody/tr[2]/td/table[5]/tbody/tr[2]/td[1]/a")
linkElement$click()
You can do this using RSelenium. See http://johndharrison.github.io/RSelenium/ . DISCLAIMER: I am the author of the RSelenium package. Basic vignettes on its operation are "RSelenium basics" and "RSelenium: Testing Shiny apps".
If you are unsure which element is selected, you can use the highlightElement utility method of the webElement class; see the commented-out code.
The element click event won't work in this case. You need to simulate a click using JavaScript:
require(RSelenium)
# RSelenium::startServer # if needed
initializing_parcel_number <- "00000000000"
remDr <- remoteDriver()
remDr$open()
remDr$navigate("http://www.muni.org/pw/public.html")
webElem <- remDr$findElement(using = "name", "PAR1")
# webElem$highlightElement() # to visually check which element is selected
webElem$sendKeysToElement(list(initializing_parcel_number, key = "enter"))
# get first link containing javascript:getParcel
webElem <- remDr$findElement(using = "css selector", '[href*="javascript:getParcel"]')
# webElem$highlightElement() # to visually check which element is selected
# send a webElement as an argument.
remDr$executeScript("arguments[0].click();", list(webElem))