Finding element location in Shiny - r

I have a Shiny app with varying table sizes depending on inputs and I am trying to test the app using RSelenium. I would like to find the element location using XPath syntax. Finding one element using an exact node works fine; however, finding several elements does not return any results at all. My Shiny app can't be shared, but the same results occur on a Shiny app hosted by RStudio.
library(RSelenium)
rd <- rsDriver()
r <- rd$client
r$navigate('https://shiny.rstudio.com/gallery/datatables-demo.html')
r$switchToFrame(r$findElements("css selector", "iframe")[[1]])
e <- r$findElements('xpath', "//*[@id='DataTables_Table_0']/tbody/tr[1]/td[3]")
e[[1]]$getElementText()
e[[1]]$getElementLocation()[c('x', 'y')]
# Works as expected
# Find all elements - does not find any elements
e_all <- r$findElements('xpath', "//*[@id='DataTables_Table_0']/tbody/tr[*]/td[*]")

In the first XPath you are selecting the third td element that is a child of the first tr element under tbody.
In the second XPath, the predicate [*] means "has at least one element child". So tr[*]/td[*] selects only td elements that themselves contain child elements, inside tr elements that contain child elements (which every tr here does, since it holds td children).
It is difficult to tell without some sample data, but I'm guessing that none of the td elements have any child elements (they contain only text), and so it isn't selecting anything.
Adjust the XPath to remove both of the predicate filters:
//*[@id='DataTables_Table_0']/tbody/tr/td
That should select all of the columns from all of the rows in that table.
If that selects too many columns and you need to restrict it, provide some example content and describe what you want to select or exclude, and we can help you add an appropriate predicate filter.
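The effect of a "has a child element" predicate can be reproduced with a small standalone snippet. This sketch uses Python's built-in ElementTree on made-up table markup; ElementTree does not support the [*] predicate, so a named child ([td], [b]) stands in for it — the filtering behaviour is the same:

```python
import xml.etree.ElementTree as ET

html = """<table id="t">
  <tbody>
    <tr><td>a</td><td><b>bold</b></td></tr>
    <tr><td>c</td><td>d</td></tr>
  </tbody>
</table>"""
root = ET.fromstring(html)

# tr[td] keeps only rows that have a td child -- every row here qualifies
rows = root.findall(".//tbody/tr[td]")
# td[b] keeps only cells that have such a child element -- just one cell does
cells_with_child = root.findall(".//tbody/tr/td[b]")
# no predicate: all cells, regardless of children
all_cells = root.findall(".//tbody/tr/td")
print(len(rows), len(cells_with_child), len(all_cells))  # 2 1 4
```

Cells holding only text fail the child-element predicate, which is exactly why tr[*]/td[*] came back empty on the DataTables page.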

Related

Is there a way to update the CSS selector within the shiny screenshotButton as choices are selected in pickerInput?

I'm using shiny with library(shinyscreenshot). I have a datatable within a tabBox and would like the user to be able to download a screenshot of the datatable with a specific div id that changes based on the pickerInput. I've noticed that after I select choices in the pickerInput, the newly rendered datatable has a different div id. The different div ids for the datatables are as follows: #DataTables_Table_0, #DataTables_Table_1, #DataTables_Table_2, #DataTables_Table_3, etc.
Is there a way I can still use "screenshotButton()" and be able to update the selector to be whatever selector matches the string?
I ran into this issue when I used the following code and nothing downloaded after the first download:
screenshotButton(filename = "TEST.png",label = "Download",download = TRUE, scale =3, selector = "#DataTables_Table_0")
I thought about using an attribute selector similar to:
[attribute~="value"]
where my code would look something like this to capture a specific string match within the selector but no luck.
screenshotButton(filename = "TEST.png",label = "Download",download = TRUE, scale =3, [id~="DataTables_Table_"])
Also, I used the id = "#table" which produces a valid download but I would like to stick with the #DataTables_Table id as this would allow for a png which shows the datatable in the most compact, cropped format without any extra filters or tabs showing within the screenshot.
Thank you in advance.
You were very close! You can use the *= operator, which matches any id containing that substring (or ^= if you want to require that the id begins with it):
selector = "[id*='DataTables_Table_']"
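The three attribute operators behave quite differently, and the ~= operator tried in the question only matches whole space-separated words, which is why it never matched a substring like DataTables_Table_. A plain-Python sketch of the semantics, using a hypothetical list of ids:

```python
ids = ["DataTables_Table_0", "DataTables_Table_3", "table", "my_DataTables_Table_x"]

def starts_with(value, s):
    """CSS [id^='s']: the id begins with s."""
    return value.startswith(s)

def contains(value, s):
    """CSS [id*='s']: the id contains s anywhere."""
    return s in value

def word_match(value, s):
    """CSS [id~='s']: s must equal a whole space-separated word."""
    return s in value.split()

prefix = "DataTables_Table_"
print([i for i in ids if starts_with(i, prefix)])  # the two DataTables_Table_N ids
print([i for i in ids if contains(i, prefix)])     # those two plus my_DataTables_Table_x
print([i for i in ids if word_match(i, prefix)])   # [] -- why [id~=...] found nothing
```

Since every generated id begins with the prefix, either ^= or *= works here; ~= cannot, because no id is exactly the word "DataTables_Table_".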

R based Web Scraper for Cabela's using rvest

Maybe slightly out of the ordinary, but I want to track down a particular rifle that I am interested in purchasing. I am familiar with R, so I started down that path, but I'm guessing there are better options.
What I want to do is check a web page hourly to see if the availability has changed. If it has, I get a text message.
I started out using rvest and twilio. The problem is that I can't figure out how to get all the way down to the data that I need. The page has an "Add to cart" button that isn't shown if the item isn't available using css style display:none.
I've tried various ways of getting down to that particular div by using id names, CSS classes, XPath, etc., but keep coming up with nothing.
Any ideas? Is it the formatting of the div name? Or do I have to manually dig through each nested div?
EDIT: I was able to find the right XPath. But as pointed out below, you can't see the style.
EDIT2 - in the out of stock div, the text "In Select Stores Only" is displayed, but I can't figure out how to isolate it.
#Schedule script to run every hour
library(rvest)
library(twilio)
#vars for sms
Sys.setenv(TWILIO_SID = "xxxxxxxxxxx")
Sys.setenv(TWILIO_TOKEN = "xxxxxxxxxxx")
#example, two url's - one with in stock item, one without
OutStockURL <- read_html("https://www.cabelas.com/shop/en/marlin-1895sbl-lever-action-rifle?searchTerm=1895%20sbl")
InStockURL <- read_html("https://www.cabelas.com/shop/en/thompson-center-venture-ii-bolt-action-centerfire-rifle")
#div id that contains information on if product is in stock or not
instockdivid <- "WC_Sku_List_TableContent_3074457345620110138_Price & Availability_1_16_grid"
outstockdivid <- "WC_Sku_List_TableContent_24936_Price & Availability_1_15_grid"
#inside the div is a button that is either displayed or not based on availability
instockbutton <- 'id="SKU_List_Widget_Add2CartButton_3074457345620110857_table"'
outstockbutton <- 'id="SKU_List_Widget_Add2CartButton_3074457345617539137_table"'
#if item is unavailable, button style is set to display:none - style="display: none;"
test <- InStockURL %>%
html_nodes("div")
#xpath to buttons
test <- InStockURL %>%
  html_nodes(xpath = '//*[@id="SKU_List_Widget_Add2CartButton_3074457345620110857_table"]')
test2 <- OutStockURL %>%
  html_nodes(xpath = '//*[@id="SKU_List_Widget_Add2CartButton_3074457345617539137_table"]')
#not sure where to go from here to see if the button is visible or not
#if button is displayed, send email
tw_send_message(
to = "+15555555555",
from = "+5555555555",
body = paste("Your Item Is Available!")
)
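Since rvest only sees static HTML, the usual workaround is to read the inline style attribute directly (or look for the stock-message text). A sketch of the idea in Python with made-up markup, standing in for the page; in rvest the equivalents would be html_attr(node, "style") and html_text(node):

```python
import xml.etree.ElementTree as ET

# hypothetical fragment of the product page
page = """<div>
  <button id="add2cart" style="display: none;">Add to cart</button>
  <div class="availability">In Select Stores Only</div>
</div>"""
root = ET.fromstring(page)

# hidden button: the inline style attribute carries display: none
button = root.find(".//button[@id='add2cart']")
hidden = "display: none" in (button.get("style") or "")

# alternative signal: the out-of-stock message text
msg_node = root.find(".//div[@class='availability']")
out_of_stock = msg_node is not None and "In Select Stores Only" in (msg_node.text or "")
print(hidden, out_of_stock)  # True True
```

If neither signal appears in the served HTML (i.e. the style is applied by CSS or JavaScript rather than inline), rvest cannot see it and a browser-driven tool such as RSelenium is needed instead.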

Unable to scrape the data using bs4

I am trying to scrape the star rating for the "value" category from a TripAdvisor hotel page, but I am not able to get the data using the class name:
Below is the code which I have tried to use:
import requests
from bs4 import BeautifulSoup
review_pages=requests.get("https://www.tripadvisor.com/Hotel_Review-g60745-d94367-Reviews-Harborside_Inn-Boston_Massachusetts.html")
soup3=BeautifulSoup(review_pages.text,'html.parser')
value=soup3.find_all(class_='hotels-review-list-parts-AdditionalRatings__bubbleRating--2WcwT')
Value_1=soup3.find_all(class_="hotels-review-list-parts-AdditionalRatings__ratings--3MtoD")
When I try to capture the values it returns an empty list. Any direction would be really helpful. I have tried multiple class names from that page and get various fields such as data, reviews, etc., but I am not able to get the bubble ratings for just one category such as service.
You can use an attribute = value selector and pass the class in with its value as a substring with ^ starts with operator to allow for different star values which form part of the attribute value.
Or, more simply use the span type selector to select for the child spans.
.hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN span
In this line:
values=soup3.select('.hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN [class^="ui_bubble_rating bubble_"]')
The first part of the selector, when reading from left to right, is selecting for the parent class of those ratings. The following space is a descendant combinator combining the following attribute = value selector which gathers a list of the qualifying children. As mentioned, you can replace that with just using span.
Code:
import requests
from bs4 import BeautifulSoup
import re
review_pages=requests.get("https://www.tripadvisor.com/Hotel_Review-g60745-d94367-Reviews-Harborside_Inn-Boston_Massachusetts.html")
soup3=BeautifulSoup(review_pages.content,'lxml')
values=soup3.select('.hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN [class^="ui_bubble_rating bubble_"]') #.hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN span
Value_1 = values[-1]
print(Value_1['class'][1])
stars = re.search(r'\d', Value_1['class'][1]).group(0)
print(stars)
Although I use re, I think it is overkill and you could simply use replace.
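For example, with a hypothetical class token like the one printed above, the two approaches compare as follows. Note that re.search(r'\d', ...) only grabs the first digit, while stripping the known prefix keeps the whole number:

```python
import re

cls = "bubble_45"  # hypothetical class token, e.g. Value_1['class'][1]

# the regex from the answer returns only the first digit
first_digit = re.search(r"\d", cls).group(0)

# stripping the known prefix keeps the full value (45 meaning 4.5 bubbles)
rating = cls.replace("bubble_", "")

print(first_digit, rating)  # 4 45
```

So for half-star ratings the replace form is not just simpler, it is also the one that preserves the full score.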

Unable to find xpath list trying to use wild card contains text or style

I am trying to find an XPath for this site, specifically for the elements under "Main Lists". I have so far:
//div[starts-with(@class, ('sm-CouponLink_Label'))]
However this finds 32 matches…
//div[starts-with(@class, ('sm-CouponLink_Label'))][contains(text(),'*')or[contains(Style(),'*')]
Unfortunately in this case I am wanting to use XPaths and not CSS.
It is for this site, my code is here and here's an image of XPATH I am after.
I have also tried:
CSS: div:nth-child(1) > .sm-MarketContainer_NumColumns3 > div > div
Xpath equiv...: //div[1]//div[starts-with(@class, ('sm-MarketContainer_NumColumns3'))]//div//div
Though it does not appear to work.
UPDATED
WORKING CSS: div.sm-Market:has(div >div:contains('Main Lists')) * > .sm-CouponLink_Label
Xpath: //div[Contains(@class, ('sm-Market'))]//preceding::('Main Lists')//div[Contains(@class, ('sm-CouponLink_Label'))]
Not working as of yet..
Though I am unsure whether Selenium has an equivalent for :has.
Alternatively...
Something like:
//div[contains(text(),"Main Lists")]//following::div[contains(@class,"sm-Market")]//div[contains(@class,"sm-CouponLink_Label")]//preceding::div[contains(@class,"sm-Market_HeaderOpen ")]
(wrong area)
You can get all required elements with below piece of code:
league_names = [league for league in driver.find_elements_by_xpath('//div[normalize-space(@class)="sm-Market" and .//div="Main Lists"]//div[normalize-space(@class)="sm-CouponLink_Label"]') if league.text]
This should return a list of only the non-empty nodes.
If I understand this correctly, you want to further narrow down the result of your first XPath to return only the divs that have inner text or a style attribute. In this case you can use the following XPath:
//div[starts-with(@class, 'sm-CouponLink_Label')][@style or text()]
UPDATE
As you clarified further, you want to get div with class 'sm-CouponLink_Label' that resides in the 'Main Lists' section. For this purpose, you should try to incorporate the 'Main Lists' in the XPath somehow. This is one possible way (formatted for readability) :
//div[
div/div/text()='Main Lists'
]//div[
starts-with(@class, 'sm-CouponLink_Label')
and
normalize-space()
]
Notice how normalize-space() is used to filter out empty divs from the result. This should return 5 elements as expected (verified in Chrome's DevTools).
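The effect of the normalize-space() filter can be reproduced in a standalone snippet. This sketch uses Python's built-in ElementTree on hypothetical markup; ElementTree has no normalize-space() function, so the whitespace test is done in Python, but the result is the same — whitespace-only labels are dropped:

```python
import xml.etree.ElementTree as ET

# hypothetical section: one header wrapper plus three label divs,
# two of which are empty or whitespace-only
snippet = """<div class="sm-Market">
  <div><div>Main Lists</div></div>
  <div class="sm-CouponLink_Label">Premier League</div>
  <div class="sm-CouponLink_Label">   </div>
  <div class="sm-CouponLink_Label"></div>
</div>"""
root = ET.fromstring(snippet)

labels = root.findall(".//div[@class='sm-CouponLink_Label']")
# normalize-space() in the XPath keeps only nodes with non-whitespace text
non_empty = [d.text.strip() for d in labels if d.text and d.text.strip()]
print(len(labels), non_empty)  # 3 ['Premier League']
```

Without the filter all three label divs come back; with it, only the one carrying visible text survives, which is why the filtered XPath returns 5 elements on the real page instead of 32.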

QTP - getting value of element

I am beginning with QTP and just cannot find out how to get the value of an element. For example, I just want to compare the number of results found by Google. I tried to select the element with Object Spy and use Val(Element) to assign the value to a variable, but it doesn't work. Could anyone help with this? BTW, I am not sure whether selecting the text (element) to compare with Object Spy is correct.
Thanks!
You should use GetROProperty in order to get the text and then parse it for the value.
Looking at a Google results page I see that the result is in a paragraph with id=resultStats in the 3rd bold tag.
<p id="resultStats"> Results <b>1</b> - <b>10</b> of about
<b>2,920,000</b>
for <b>qtp</b>. (<b>0.22</b> seconds)</p>
So the following script gets the number (as a string with commas).
Browser("micclass:=Browser") _
  .Page("micclass:=Page") _
  .WebElement("html id:=resultStats") _
  .WebElement("html tag:=b", "index:=2").GetROProperty("innertext")
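The same index-2 lookup the QTP script performs can be checked offline against the sample markup above. This is a Python ElementTree sketch, not QTP code, included only to show which bold tag the index selects and how the comma-formatted string could then be converted to a number:

```python
import xml.etree.ElementTree as ET

# the resultStats paragraph quoted above
p = """<p id="resultStats"> Results <b>1</b> - <b>10</b> of about
<b>2,920,000</b>
for <b>qtp</b>. (<b>0.22</b> seconds)</p>"""
root = ET.fromstring(p)

bolds = root.findall("b")
total = bolds[2].text  # index 2 == the 3rd <b>, as in the QTP script
print(total)                         # 2,920,000
print(int(total.replace(",", "")))   # 2920000
```

In QTP itself you would do the equivalent conversion on the GetROProperty("innertext") result, e.g. with Replace() and CLng(), since the raw inner text still contains the thousands separators.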
