Loop child nodes inside parent in AngleSharp c# or vb - anglesharp

I am using AngleSharp to parse an HTML string.
I can get it to parse an outer div that contains some inner divs / h3 tags I want to extract, but I can't work out how to loop through the inner elements.
This is what I have so far:
Dim outer_linq = document.All.Where(Function(w) w.LocalName = "div" AndAlso w.ClassList.Contains("the-product"))
For Each item In outer_linq
    If item.LocalName = "h1" AndAlso item.ClassList.Contains("the-product-title") Then
        ' Found h1.the-product-title, so do something with it here
    End If
    If item.LocalName = "div" AndAlso item.ClassList.Contains("price") Then
        ' Found div.price, so do something with it here
    End If
Next
So it is finding everything inside of div.the-product, but how do I look through each div.the-product and get its h1.the-product-title and div.price?
There are several div.the-product elements, and each one contains an h1.the-product-title and a div.price.
I'm using VB, but C# would be OK too.
Thanks if anyone can help.

While you can leverage techniques such as LINQ in AngleSharp, we encourage everyone to use the DOM (Document Object Model) as much as possible.
Instead of using document.All.Where you should just use document.QuerySelectorAll:
document.QuerySelectorAll("div.the-product")
You could even then perform the nesting directly, e.g.,
document.QuerySelectorAll("div.the-product h1.the-product-title")
would find all h1 elements that have a the-product-title class and are descendants of div elements that have a the-product class. If you want children (instead of descendants), just use the > operator:
document.QuerySelectorAll("div.the-product > h1.the-product-title")
The mistake in your code above is that you test item again. All retrieved items are already div elements (that's what you iterate over), so they cannot be h1 elements, too.
An easy fix is to keep the outer loop as written, but inside it query each item, e.g.,
Dim allInnerH1 = item.QuerySelectorAll("h1.the-product-title")
Dim allInnerPrices = item.QuerySelectorAll("div.price")
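The same "query inside each parent" pattern can be sketched outside AngleSharp. The following Python example is only a stand-in: it uses the standard library's xml.etree.ElementTree instead of AngleSharp, the sample markup is hypothetical, and ElementTree's limited XPath matches the class attribute exactly rather than as a CSS class list.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the real page structure is an assumption.
SAMPLE = """
<body>
  <div class="the-product">
    <h1 class="the-product-title">Widget</h1>
    <div class="price">9.99</div>
  </div>
  <div class="the-product">
    <h1 class="the-product-title">Gadget</h1>
    <div class="price">19.99</div>
  </div>
</body>
"""


def extract_products(markup):
    """Return one (title, price) pair per div.the-product."""
    root = ET.fromstring(markup)
    products = []
    # Outer query: every product container.
    for item in root.findall(".//div[@class='the-product']"):
        # Inner queries are scoped to this container only.
        title = item.find(".//h1[@class='the-product-title']")
        price = item.find(".//div[@class='price']")
        products.append((title.text, price.text))
    return products


print(extract_products(SAMPLE))
```

The point is that the inner lookups are called on item, not on the whole document, so each title stays paired with the price from the same container.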

Related

TestCafe deep equality fails when a CSS-formatted string is compared with a normally constructed string

Trying to compare two sentences (deep equal) with one of them having been created with the help of some css formatting fails. The string I want to compare did have a couple of space separators added with
.space:after {
content: " ";
}
These are the statements:
stmt1 = 'this is the text'
stmt2 = 'this<span className="space"></span>is the<span className="space"></span>text'
On the UI both appear identical, but when the tests are run in TestCafe, stmt2 appears as 'thisis thetext'.
Can someone help me fix this?
This issue is not related to TestCafe. CSS pseudo-elements like :before and :after are not part of the DOM, so their content is not included in the textContent property. You can try to implement a custom solution using ClientFunction.
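To see why the two strings differ, it can help to collect only the text nodes of the markup, which is roughly what textContent reports. The sketch below uses Python's standard html.parser as a stand-in for the browser; it is not TestCafe code, and the helper names are made up for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text nodes only; CSS-generated content never appears here."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def text_content(markup):
    """Approximate textContent by joining all text nodes in the markup."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.parts)


stmt1 = 'this is the text'
stmt2 = 'this<span className="space"></span>is the<span className="space"></span>text'

print(text_content(stmt2))
print(text_content(stmt2) == stmt1)
```

The empty span elements contribute no text of their own, and the space added by the .space:after rule exists only in the rendered output, so the comparison with stmt1 fails.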

Unable to find xpath list trying to use wild card contains text or style

I am trying to find an XPath for the items under “Main Lists” on this site. So far I have:
//div[starts-with(@class, ('sm-CouponLink_Label'))]
However this finds 32 matches…
`//div[starts-with(@class, ('sm-CouponLink_Label'))]`[contains(text(),'*')or[contains(Style(),'*')]
Unfortunately in this case I am wanting to use XPaths and not CSS.
It is for this site, my code is here and here's an image of XPATH I am after.
I have also tried:
CSS: div:nth-child(1) > .sm-MarketContainer_NumColumns3 > div > div
Xpath equiv...: //div[1]//div[starts-with(@class, ('sm-MarketContainer_NumColumns3'))]//div//div
Though it does not appear to work.
UPDATED
WORKING CSS: div.sm-Market:has(div >div:contains('Main Lists')) * > .sm-CouponLink_Label
Xpath: //div[Contains(@class, ('sm-Market'))]//preceding::('Main Lists')//div[Contains(@class, ('sm-CouponLink_Label'))]
Not working as of yet.
I am also unsure whether Selenium has an equivalent for :has.
Alternatively...
Something like:
//div[contains(text(),"Main Lists")]//following::div[contains(@class,"sm-Market")]//div[contains(@class,"sm-CouponLink_Label")]//preceding::div[contains(@class,"sm-Market_HeaderOpen ")]
(wrong area)
You can get all required elements with the piece of code below:
league_names = [league for league in driver.find_elements_by_xpath('//div[normalize-space(@class)="sm-Market" and .//div="Main Lists"]//div[normalize-space(@class)="sm-CouponLink_Label"]') if league.text]
This should return a list of only the non-empty nodes.
If I understand this correctly, you want to narrow down the result of your first XPath further, so that it returns only the div elements that have inner text or a style attribute. In this case you can use the following XPath:
//div[starts-with(@class, ('sm-CouponLink_Label'))][@style or text()]
UPDATE
As you clarified further, you want to get div with class 'sm-CouponLink_Label' that resides in the 'Main Lists' section. For this purpose, you should try to incorporate the 'Main Lists' in the XPath somehow. This is one possible way (formatted for readability) :
//div[
    div/div/text()='Main Lists'
]//div[
    starts-with(@class, 'sm-CouponLink_Label')
    and
    normalize-space()
]
Notice how normalize-space() is used to filter out empty div elements from the result. This returned 5 elements, as expected, when I tested it in Chrome.
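The logic of that XPath (restrict to the section whose header reads "Main Lists", then keep only labels with non-blank text) can be checked offline. The following sketch is an assumption-laden stand-in: the sample markup is invented, and because the standard library's ElementTree does not support starts-with() or normalize-space(), the two filters are written as plain Python instead of XPath.

```python
import xml.etree.ElementTree as ET

# Invented markup that only mimics the class names from the question.
SAMPLE = """
<root>
  <div class="sm-Market">
    <div class="sm-Market_HeaderOpen"><div>Main Lists</div></div>
    <div class="sm-CouponLink_Label">League A</div>
    <div class="sm-CouponLink_Label"> </div>
    <div class="sm-CouponLink_Label">League B</div>
  </div>
  <div class="sm-Market">
    <div class="sm-Market_HeaderOpen"><div>Other</div></div>
    <div class="sm-CouponLink_Label">League C</div>
  </div>
</root>
"""


def labels_in_section(root, heading):
    """Return non-empty label texts from the market whose header is `heading`."""
    names = []
    for market in root.iter("div"):
        if (market.get("class") or "").strip() != "sm-Market":
            continue
        # Equivalent of the `.//div="Main Lists"` test in the XPath.
        if not any((d.text or "").strip() == heading for d in market.iter("div")):
            continue
        for label in market.iter("div"):
            if (label.get("class") or "").startswith("sm-CouponLink_Label"):
                text = "".join(label.itertext()).strip()
                if text:  # same role as normalize-space() in the XPath
                    names.append(text)
    return names


print(labels_in_section(ET.fromstring(SAMPLE), "Main Lists"))
```

Against a live page, the XPath from the answers would be passed to Selenium directly; this sketch only mirrors the two filtering steps.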

Justify text according to a size as opposed to string length? ASP.NET

I have a listbox with text in it, and I was asked to see if I could justify its contents after the dash. My resulting code produced something like this:
Which works fine for scenarios where the text to the left of the dash is less than the max length found from the other items in the listbox (i.e. (B20) is less than (B15-B19), which is the longest entry found, so add some whitespace before the dash).
The issue, though, is that if the text before the dash is same length, it still looks like it isn't justified. Example:
Is there a way to truly line up all the dashes? I would imagine I would have to look at the actual pixel length of the characters before the dash as opposed to the length?
Notes:
I am using ASP.NET Webforms
VB.NET
The text for each item in the listbox is all one string
Right now, my method to accomplish what you see in the first picture is as follows:
Public Sub JustifyDisplayName()
    Const ACCOUNT_FOR_DASH As Integer = 4
    Dim maxCharCount As Integer = 0
    ' Decodes to a single non-breaking space character
    Dim whiteSpace As String = HttpUtility.HtmlDecode("&nbsp;")
    'Find which one is the longest code
    For Each element As TextEntry In Me
        If element.Value.Length > maxCharCount Then
            maxCharCount = element.Value.Length
        End If
    Next
    'Now, extend the '-' to the max for all items
    For Each element As TextEntry In Me
        'See how much white space we need to inject
        Dim paddingNeeded As Integer = maxCharCount - element.Value.Length
        Dim tempDisplay As StringBuilder = New StringBuilder(element.Value)
        If paddingNeeded > 0 Then
            tempDisplay.Append(CChar(whiteSpace), paddingNeeded + ACCOUNT_FOR_DASH)
        End If
        'Append the description once, whether or not padding was added
        tempDisplay.Append(" - " & element.Description)
        element.DrillDownDisplayNameJustified = tempDisplay.ToString()
    Next
End Sub
Thanks.
If you used a fixed-width font, you could make this all much easier. In addition to good ol' Courier, I believe there are others.
If you don't, you're not going to be able to get exactly the right width. You could get close, but you won't get it exactly, because the difference in length between (H60-H95) and (I00-I99) as they are rendered may not evenly divide into increments of one &nbsp;.
But if you really want to give this a try, you'll have to use the System.Drawing namespace, the Graphics class, and a method on Graphics called MeasureString. This will be just to get the lengths of the strings in your selected font, though: System.Drawing doesn't apply to web apps.
If you could append spaces to short items before the dash so that you always have the same number of characters before the dash, you may consider using Monospaced Fonts, where each character occupies the same width - Ref: Similar Question.
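As a minimal sketch of the monospaced approach: pad every code to the length of the longest one with non-breaking spaces, then append the dash and description. This is written in Python rather than VB.NET purely for illustration; the function name and sample entries are hypothetical, and the alignment only holds if the listbox is rendered in a fixed-width font.

```python
NBSP = "\u00a0"  # non-breaking space, so the browser does not collapse the padding


def justify(entries):
    """Pad each code to the longest code's length, then append ' - description'."""
    width = max(len(code) for code, _ in entries)
    return [
        code + NBSP * (width - len(code)) + " - " + description
        for code, description in entries
    ]


# Hypothetical entries modelled on the codes mentioned in the question.
for row in justify([("(B20)", "Alpha"), ("(B15-B19)", "Beta")]):
    print(row)
```

With a proportional font the same padding gives only an approximate result, for the reason given above.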

Using selenium CSS selector for multiple things

This is in Perl if it matters. I have several lists of links that collapse and expand. I know how many there are from using
get_xpath_count('//li/a')
The problem is I need to get a list of the names of these actual links. I've tried using XPath, but haven't had much luck, and was hoping CSS selectors would be able to help. I've tried using
get_text('css=li a:nth-child('.$i.')')
which prints out a [-] icon next to a link, the very top link in the list, and then an out of range error. I'm not familiar with CSS selectors at all, so any help would be great. If I left out important info, please let me know.
Try this (in pseudo-code, because I avoid Perl like the plague):
list linkNames;
count = selenium.get_xpath_count('//li/a');
for (i = 1; i <= count; i++) {
    linkNames.append(selenium.get_text('xpath=(//li/a)[' + i + ']'));
}
Note:
XPath expressions count from 1 to n, not 0 to n-1 like most C-derived languages.
The XPath form for selecting the i'th match of a pattern is (pattern)[i], not pattern[i].
Selenium doesn't assume the (pattern)[i] locator is an XPath, so you need to say so by starting it with xpath=.
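The difference between pattern[i] and (pattern)[i] can be seen on a small invented list. The sketch below uses Python's xml.etree.ElementTree as a stand-in for Selenium's XPath engine; ElementTree supports only a subset of XPath and does not accept the parenthesized form, so the "i-th match overall" is taken by indexing the result list in Python instead.

```python
import xml.etree.ElementTree as ET

# Invented markup: each li holds exactly one link.
SAMPLE = """
<ul>
  <li><a>First</a></li>
  <li><a>Second</a></li>
  <li><a>Third</a></li>
</ul>
"""

root = ET.fromstring(SAMPLE)


def links_at_position(root, i):
    """pattern[i]: the i-th <a> child within each <li>, evaluated per parent."""
    return [a.text for a in root.findall(".//li/a[%d]" % i)]


def nth_link_overall(root, i):
    """(pattern)[i]: the i-th match of the whole pattern, counting from 1."""
    return root.findall(".//li/a")[i - 1].text


print(links_at_position(root, 1))  # one hit per <li>, not a single link
print(links_at_position(root, 2))  # no <li> has a second link
print(nth_link_overall(root, 2))
```

This is also a likely explanation for the behaviour in the question: a per-parent position test such as nth-child matches relative to each li, so higher indexes run out of range.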

asp.net: explicit localization & combining strings

This seems like it should be a simple thing to do, but I can't figure it out.
I have a localized resource that I'm using in two places - one as a col. header in a datagrid, and then as a descriptor beside a field when the user edits a row.
The text of the label looks like:
Text="<%$Resources:Global,keyName%>"
However, I'd like to add a trailing : to the label - except if I change the above to
Text="<%$Resources:Global,keyName%>:"
then the : is the only thing that shows up! I've tried it with simple strings, so there's nothing special about the colon char that causes this.
Surely I don't have to have 2 different resources?
Have you tried Text="<%$Resources:Global,keyName%>" + ":" ?
You'd basically be concatenating two strings. Or treat them as two strings
StringBuilder t;
t.append(<%$Resources:Global,keyName%>)
t.append(":")
Text = t;
Assuming you need to keep the : together with the text for styling reasons, replace the label with a span:
<span><%=Resources.Global.keyName %>:</span>
Well, sometimes the obvious isn't so obvious until someone else looks at it:
Text="<%$Resources:Global,keyName%>" /> :
Just move the : outside the label tag, and all is well.
