SimpleDom Search Via plaintext Text

I am using the "PHP Simple HTML DOM Parser" library and want to find elements by their text value (plaintext).
For example, I need to find a span element by its value "Red":
<span class="color">Red</span>
I expected the code below to work, but it replaces the value instead of searching for it:
$brand = $html->find('span',0)->plaintext='Red';
I read the manual and also looked through the library code itself, but could not find a solution. Please advise whether I am missing something or whether this is simply not possible with Simple HTML DOM Parser.
P.S. I am aware of other approaches, such as regex.

Using $html->find('span', 0) returns the Nth span, where N is a zero-based index; here N is 0, so you get the first span.
Using $html->find('span',0)->plaintext='Red'; assigns Red to that span's plaintext; it does not search for it.
To find the elements whose text is Red, omit the index so that find() returns all spans, then loop over them and compare the text.
For example, using innertext instead of plaintext:
$spansWithRedText = [];
foreach ($html->find('span') as $element) {
    // Keep only the spans whose inner text is exactly "Red"
    if ($element->innertext === "Red") {
        $spansWithRedText[] = $element;
    }
}

Related

Get the text associated with a href element in a given page in scrapy

Currently the 'yield' in my scrapy spider looks as follows:
yield {
    'hreflink': mylink,
    'Parentlink': response.url
}
This returns a dict:
{
    'hreflink': "https://www.southeasthealth.org/wp-content/uploads/Southeast-Health-Standard-Charges-2022.xlsx",
    'Parentlink': "https://www.southeasthealth.org/financial-information-price-transparency/"
}
Now I also want the text that is associated with this particular hreflink on that particular Parentlink page. So my final output should look like:
{
    'hreflink': "https://www.southeasthealth.org/wp-content/uploads/Southeast-Health-Standard-Charges-2022.xlsx",
    'Parentlink': "https://www.southeasthealth.org/financial-information-price-transparency/",
    'Yourtext': "Download Pricing Info"
}
What would be the simplest way to achieve that? I want to use an XPath expression to get the text of the link on the parent page whose href attribute (@href) equals my link.
So far, here is what I tried:
Yourtext = response.xpath('//a[@href='+json.dumps(each)+']//text()').get()
but it's not printing anything. I tried printing my response and it returns the right page: 'https://www.southeasthealth.org/financial-information-price-transparency/'

If I understand you correctly, you want the text that belongs to the link, i.e. Download Pricing Info.
I suggest you try using:
response.xpath("//span[@class='fusion-button-text']//text()").get()

I found the answer to my question.
'//a[@href='+json.dumps(each)+']//text()'
This is the correct expression. However, the comparison against the href value in 'each' is case-sensitive, so the link must match the page's href exactly for this XPath to work.
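The exact-match behaviour can be tried without running a spider. The following is a minimal sketch that uses Python's standard-library ElementTree as a stand-in for the scrapy response; the PAGE markup and the link_text helper are hypothetical and only illustrate the idea. It assumes the fragment is well-formed and that the href contains no single quote.

```python
import xml.etree.ElementTree as ET

HREF = ("https://www.southeasthealth.org/wp-content/uploads/"
        "Southeast-Health-Standard-Charges-2022.xlsx")

# Hypothetical, well-formed stand-in for the relevant part of the page.
PAGE = f"""
<div>
  <a href="{HREF}">
    <span class="fusion-button-text">Download Pricing Info</span>
  </a>
  <a href="https://www.southeasthealth.org/contact">Contact</a>
</div>
"""

def link_text(root, href):
    """Return the text inside the first <a> whose href equals `href` exactly, or None."""
    # Attribute comparison in XPath is an exact, case-sensitive string match.
    anchor = root.find(f".//a[@href='{href}']")
    if anchor is None:
        return None
    # itertext() also collects text from nested elements such as the <span>.
    return "".join(anchor.itertext()).strip()

root = ET.fromstring(PAGE)
print(link_text(root, HREF))          # Download Pricing Info
print(link_text(root, HREF.lower()))  # None: the case differs, so nothing matches
```

Collecting the text of all descendants mirrors the //text() step in the expression above, which is why the nested span does not get in the way.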

TestCafe deep equality fails when a CSS-formatted string is compared with a normally constructed string

Comparing two sentences for deep equality fails when one of them was built with the help of CSS formatting. The string I want to compare had a couple of space separators added with
.space:after {
    content: " ";
}
The following are the statements:
stmt1 = 'this is the text'
stmt2 = 'this<span className="space"></span>is the<span className="space"></span>text'
In the UI both appear identical, but when the tests run in TestCafe, stmt2 appears as 'thisis thestring'.
Can someone help me fix this?

This issue is not related to TestCafe. CSS pseudo-elements like :before and :after are not part of the DOM, so their content is not included in the textContent property. You can try to implement a custom solution using ClientFunction.

AvalonEdit reordering of document lines

We have recently started to evaluate AvalonEdit. We want to use it for a custom language. One of our requirements is to reorder and also to sort document lines by certain criteria. How can this be accomplished?
Thanks in advance!

AvalonEdit provides the ICSharpCode.AvalonEdit.Document.DocumentLine class, but this class only provides metadata about a line: its length, its starting and ending offset, and so on.
In my opinion, there are two ways to approach your problem:
Loop through all lines using TextEditor.LineCount and fetch each line as a DocumentLine with TextEditor.Document.GetLineByNumber(int number). You can then use TextEditor.Document.GetText(DocumentLine.Offset, DocumentLine.Length) to get the line's text.
Use TextEditor.Text.Split('\n') to get all lines as a string array.
I'd recommend the DocumentLine method. Even though you have to call GetText to get a line's text, the metadata on the lines is very useful.
To get all DocumentLines you can use a loop:
List<DocumentLine> DocumentLines = new List<DocumentLine>(TextEditor.LineCount);
// Line numbers are 1-based, so the last valid number is LineCount itself
for (int i = 1; i <= TextEditor.LineCount; i++)
{
    DocumentLines.Add(TextEditor.Document.GetLineByNumber(i));
}
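Once the line texts are collected, the reordering itself is a plain sort followed by writing the joined text back to the editor. Below is a language-neutral sketch of that step in Python rather than C#; reorder_lines and the sort keys are hypothetical names chosen for illustration, not part of AvalonEdit.

```python
def reorder_lines(text, key=None, reverse=False):
    """Return `text` with its lines sorted by `key`, keeping a trailing newline if present."""
    had_trailing_newline = text.endswith("\n")
    lines = text.splitlines()             # note: also normalizes \r\n to \n on rejoin
    lines.sort(key=key, reverse=reverse)  # stable: equal keys keep their original order
    result = "\n".join(lines)
    return result + "\n" if had_trailing_newline else result

source = "b = 2\na = 1\nccc = 3\n"
print(reorder_lines(source))           # alphabetical order
print(reorder_lines(source, key=len))  # shortest line first
```

Because the sort is stable, sorting by a coarse criterion keeps lines with equal keys in their existing order, which is usually what an editor user expects.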

Using selenium CSS selector for multiple things

This is in Perl, if it matters. I have several lists of links that collapse and expand. I know how many links there are from using
get_xpath_count('//li/a')
The problem is that I need to get a list of the names of these actual links. I've tried using XPath without much luck, and was hoping CSS selectors would be able to help. I've tried using
get_text('css=li a:nth-child('.$i.')')
which prints out a [-] icon next to a link, then the very top link in the list, and then an out-of-range error. I'm not familiar with CSS selectors at all, so any help would be great. If I left out important info, please let me know.

Try this (in pseudo-code, because I avoid Perl like the plague):
list linkNames;
count = selenium.get_xpath_count('//li/a');
for (i = 1; i <= count; i++) {
    linkNames.append(selenium.get_text('xpath=(//li/a)[' + i + ']'));
}
Note:
XPath expressions count from 1 to n, not from 0 to n-1 like most C-derived languages.
The XPath form for selecting the i'th match of a pattern is (pattern)[i], not pattern[i]. The latter applies the index relative to each parent element.
Selenium doesn't assume that the (pattern)[i] locator is an XPath, so you need to say so by starting it with xpath=.
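The difference between pattern[i] and (pattern)[i] is easy to see on a small document. The sketch below uses Python's standard-library ElementTree instead of Selenium; the markup is hypothetical. ElementTree's XPath subset does not accept the parenthesized form, so indexing the full result list stands in for (//li/a)[i] here.

```python
import xml.etree.ElementTree as ET

# Hypothetical list markup: the first <li> holds two links.
doc = ET.fromstring("""
<ul>
  <li><a>Home</a><a>Help</a></li>
  <li><a>News</a></li>
  <li><a>About</a></li>
</ul>
""")

# pattern[i]: the index is evaluated per parent, so this picks
# the first <a> inside every <li>.
first_in_each_li = [a.text for a in doc.findall(".//li/a[1]")]

# (pattern)[i]: the index is applied to the whole match list.
all_links = doc.findall(".//li/a")

def nth_match(i):
    """Return the text of the i'th match overall, using XPath's 1-based counting."""
    return all_links[i - 1].text

print(first_in_each_li)  # ['Home', 'News', 'About']
print(nth_match(2))      # Help
```

This is why a per-parent index runs out of range early: only the first list item has a second link, while the overall match list has four entries.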

asp.net: explicit localization & combining strings

This seems like it should be a simple thing to do, but I can't figure it out.
I have a localized resource that I'm using in two places: once as a column header in a datagrid, and once as a descriptor beside a field when the user edits a row.
The text of the label looks like:
Text="<%$Resources:Global,keyName%>"
However, I'd like to add a trailing : to the label - except if I change the above to
Text="<%$Resources:Global,keyName%>:"
then the : is the only thing that shows up! I've tried it with simple strings, so there's nothing special about the colon char that causes this.
Surely I don't have to have 2 different resources?

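Have you tried Text="<%$Resources:Global,keyName%>" + ":" ?
You'd basically be concatenating two strings. Or build the text from the two strings in code:
// Assumes the strongly typed accessor Resources.Global.keyName is available
StringBuilder t = new StringBuilder();
t.Append(Resources.Global.keyName);
t.Append(":");
Text = t.ToString();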

Assuming you need to keep the : together with the text for styling reasons, replace the label with a span:
<span><%=Resources.Global.keyName %>:</span>

Well, sometimes the obvious isn't so obvious until someone else looks at it:
Text="<%$Resources:Global,keyName%>" /> :
Just move the : outside the label tag, and all is well.
