Get data attributes with Nokogiri

I'm scraping a site that has a number of divs with the same ".pane" class and same "data-pane" data attributes.
input = doc.css('.pane[data-pane]')
How do I filter or select from the above to get the div which has a "data-pane" attribute equal to a specific value?

You can just treat it as you would any other attribute with the usual CSS syntax:
input = doc.css('.pane[data-pane="the value"]')
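For comparison, here is a minimal sketch of the same "attribute equals value" filter in Python, using only the standard library's ElementTree and its limited XPath support rather than Nokogiri. The markup and the helper name panes_with_value are made up for illustration, and the sketch assumes the input is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for the scraped page.
HTML = """
<body>
  <div class="pane" data-pane="first">one</div>
  <div class="pane" data-pane="the value">two</div>
  <div class="other" data-pane="the value">three</div>
</body>
"""

def panes_with_value(root, value):
    """Return div elements with class "pane" whose data-pane equals `value`."""
    # ElementTree's XPath subset supports [@attr='value'] predicates.
    # Note: this simple f-string would break if `value` contained a quote.
    candidates = root.findall(f".//div[@data-pane='{value}']")
    # Mimic the .pane class selector: class is a space-separated token list.
    return [el for el in candidates if "pane" in el.get("class", "").split()]

root = ET.fromstring(HTML)
matches = panes_with_value(root, "the value")
print([el.text for el in matches])  # prints ['two']
```

The third div is skipped because its class is not "pane", which mirrors what the combined selector .pane[data-pane="the value"] does.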

Related

Is there a way to update the css selector within the shiny screenshotButton as there's choices selected in pickerInput?

I'm using Shiny with library(shinyscreenshot). I have a datatable within a tabBox and would like the user to be able to download a screenshot of the datatable, whose div id changes based on the pickerInput. I've noticed that after I select choices in the pickerInput, the newly rendered datatable has a different div id. The div ids for the datatables are as follows: #DataTables_Table_0, #DataTables_Table_1, #DataTables_Table_2, #DataTables_Table_3, etc.
Is there a way I can still use "screenshotButton()" and be able to update the selector to be whatever selector matches the string?
I ran into this issue when I used the following code and nothing downloaded after the first download:
screenshotButton(filename = "TEST.png",label = "Download",download = TRUE, scale =3, selector = "#DataTables_Table_0")
I thought about using an attribute selector similar to:
[attribute~="value"]
so that my code would look something like this, matching a specific string within the id, but I had no luck:
screenshotButton(filename = "TEST.png",label = "Download",download = TRUE, scale =3, [id~="DataTables_Table_"])
Also, I used the id = "#table" which produces a valid download but I would like to stick with the #DataTables_Table id as this would allow for a png which shows the datatable in the most compact, cropped format without any extra filters or tabs showing within the screenshot.
Thank you in advance.

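You were very close! [attr~="value"] only matches a whole whitespace-separated word, so it never matches part of an id. Use the *= operator instead, which matches any attribute value that contains the given substring (^= would match values that begin with it):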
selector = "[id*='DataTables_Table_']"
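To make the difference between the operators concrete, here is a rough Python sketch of how each CSS attribute operator treats a value. The function attr_matches is purely illustrative (it is not part of Shiny, shinyscreenshot, or any CSS engine), and the ids are the ones from the question.

```python
def attr_matches(value: str, op: str, needle: str) -> bool:
    """Approximate how a CSS selector [attr<op>"needle"] treats `value`."""
    if op == "=":    # exact match
        return value == needle
    if op == "~=":   # needle is one of the whitespace-separated words
        return needle in value.split()
    if op == "^=":   # value begins with needle (empty needle never matches)
        return bool(needle) and value.startswith(needle)
    if op == "$=":   # value ends with needle
        return bool(needle) and value.endswith(needle)
    if op == "*=":   # needle occurs anywhere in value
        return bool(needle) and needle in value
    raise ValueError(f"unsupported operator: {op}")

ids = ["DataTables_Table_0", "DataTables_Table_1", "table"]
for op in ("~=", "^=", "*="):
    hits = [i for i in ids if attr_matches(i, op, "DataTables_Table_")]
    print(op, hits)
```

With these ids, ~= finds nothing, while ^= and *= both find the two DataTables ids, so either of the latter would work as the selector here.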

Unable to scrape the data using bs4

I am trying to scrape the star rating for the "value" data from the Trip Advisor hotels but I am not able to get the data using class name:
Below is the code which I have tried to use:
review_pages=requests.get("https://www.tripadvisor.com/Hotel_Review-g60745-d94367-Reviews-Harborside_Inn-Boston_Massachusetts.html")
soup3=BeautifulSoup(review_pages.text,'html.parser')
value=soup3.find_all(class_='hotels-review-list-parts-AdditionalRatings__bubbleRating--2WcwT')
Value_1=soup3.find_all(class_="hotels-review-list-parts-AdditionalRatings__ratings--3MtoD")
When I try to capture the values, it returns an empty list. Any direction would be really helpful. I have tried multiple class names from that page, and I get various fields such as data, reviews, etc., but I am not able to get the bubble ratings for service only.

You can use an attribute selector on class with the ^= (starts with) operator, passing only the common prefix of the class value, so that the different star values that form the rest of the attribute value still match.
Or, more simply, use the span type selector to select the descendant spans:
.hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN span
In this line:
values=soup3.select('.hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN [class^="ui_bubble_rating bubble_"]')
Reading from left to right, the first part of the selector matches the parent class of those ratings. The space that follows is a descendant combinator, and the attribute selector after it gathers a list of the qualifying descendants. As mentioned, you can replace that attribute selector with just span.
Code:
import re
import requests
from bs4 import BeautifulSoup
review_pages = requests.get("https://www.tripadvisor.com/Hotel_Review-g60745-d94367-Reviews-Harborside_Inn-Boston_Massachusetts.html")
soup3 = BeautifulSoup(review_pages.content, 'lxml')
# Alternative selector: .hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN span
values = soup3.select('.hotels-hotel-review-about-with-photos-Reviews__subratings--3DGjN [class^="ui_bubble_rating bubble_"]')
# Take the last matched sub-rating
Value_1 = values[-1]
# The second class token carries the rating, e.g. bubble_<digits>
print(Value_1['class'][1])
# Extract the first digit of that token
stars = re.search(r'\d', Value_1['class'][1]).group(0)
print(stars)
Although I use re, I think it is overkill and you could simply use replace.
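As a sketch of the replace approach: the helper below strips the bubble_ prefix from the class token. Note that the regex above captures only the first digit, whereas replace keeps all of them. The name bubble_rating is made up for illustration, and treating the digits as tenths of a star (so bubble_45 means 4.5) is an assumption about the site's naming, not something the page confirms.

```python
def bubble_rating(class_token: str) -> float:
    """Convert a class token such as 'bubble_45' to a star rating."""
    digits = class_token.replace("bubble_", "")  # 'bubble_45' -> '45'
    # Assumption: the digits encode tenths of a star.
    return int(digits) / 10

print(bubble_rating("bubble_40"))  # prints 4.0
print(bubble_rating("bubble_45"))  # prints 4.5
```

In the code above you would call it as bubble_rating(Value_1['class'][1]).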

Dynamic predicate in XQuery

I've noticed that a predicate doesn't work when the field to compare is dynamic.
For example:
db:open("library")//book[$filterFields = $pattern]
For this I get 0 results,
but when I put, for example, category in place of $filterFields, I get some results.
How can I use a variable as the field in a predicate?

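In the predicate [$filterFields = $pattern], the variable is just a string value that gets compared with $pattern; it does not look up a child element of that name. If $filterFields is supposed to contain a list of element names, you can possibly use the following query: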
db:open("library")//book
[*[name() = $filterFields] = $pattern]
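The same idea, sketched in Python with the standard library's ElementTree to show what the predicate does: keep each book that has at least one child whose name is in the list and whose text equals the pattern. The sample data and the function name filter_books are hypothetical; this is only an illustration of the logic, not how BaseX evaluates the query.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the "library" database.
LIBRARY = """
<library>
  <book><title>Dune</title><category>scifi</category></book>
  <book><title>scifi</title><category>essay</category></book>
  <book><title>Emma</title><category>novel</category></book>
</library>
"""

def filter_books(root, filter_fields, pattern):
    """Keep books with a child named in `filter_fields` whose text equals `pattern`."""
    names = set(filter_fields)
    return [
        book for book in root.iter("book")
        # Like *[name() = $filterFields] = $pattern: one matching child is enough.
        if any(child.tag in names and "".join(child.itertext()) == pattern
               for child in book)
    ]

root = ET.fromstring(LIBRARY)
print([b.findtext("title") for b in filter_books(root, ["category"], "scifi")])
print([b.findtext("title") for b in filter_books(root, ["title", "category"], "scifi")])
```

The first call matches only on category; the second also matches the book whose title happens to be "scifi", because any listed field may satisfy the comparison.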

How can I show the default value of each column in an ERD generated by PowerDesigner?

I want to show the default value of each column in my ERD. How can I do that?
I don't see anything about it in the Content tab of the Symbol Format window. Only in the Preview tab of the table properties can I see the default values for columns, in the generated SQL. The PowerDesigner version is 16.6.

You can modify the attributes displayed for columns under Tools > Display Preferences > Table > Advanced > Columns > List columns:
Use Select > (Select Attributes) Default Value to add the default value.
This list of attributes does not include long-text attributes such as Description. For these, you might be able to work out a solution using calculated attributes.

CSS3 attribute filtering on names with special characters

For Chartist line graphs, I'm trying to do conditional coloring of points via CSS. The value that determines the color is the ",0" on the ct:value attribute, as follows:
<line x1="555" y1="317.2833251953125" x2="555.01" y2="317.2833251953125" class="ct-point" ct:value="63624984960667,0"></line>
What attribute selector can we use that styles all line elements such that ct:value contains ",0"?
I've attempted the following without success:
line[ct\:value~=",0"] {
//etc
}
UPDATE: Modified original code to show tilde (~) instead of caret (^).

The attribute-matching expression [attr^="value"] means the attribute has to start with value, not just contain it, and [attr~="value"] only matches a whole whitespace-separated word. You want *= instead to match anywhere in the string:
line[ct\:value*=",0"] {
...
}
