Using selenium CSS selector for multiple things - css

This is in Perl if it matters. I have several lists of links that collapse and expand. I know how many there are from using
get_xpath_count('//li/a')
The problem is that I need to get a list of the names of these actual links. I've tried using XPath, but haven't had much luck, and was hoping CSS selectors would be able to help. I've tried using
get_text('css=li a:nth-child('.$i.')')
which prints out a [-] icon next to a link, the very top link in the list, and then an out of range error. I'm not familiar with CSS selectors at all, so any help would be great. If I left out important info, please let me know.

Try this (in pseudo-code, because I avoid Perl like the plague):
list linkNames;
count = selenium.get_xpath_count('//li/a');
for (i = 1; i <= count; i++) {
    linkNames.append(selenium.get_text('xpath=(//li/a)[' + i + ']'));
}
Note:
XPath expressions count from 1 to n, not 0 to n-1 like most C-derived languages.
The XPath form for selecting the i'th match of a pattern is (pattern)[i], not pattern[i].
Selenium doesn't assume the (pattern)[i] locator is an XPath, so you need to say so by starting it with xpath=.
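To make the difference between pattern[i] and (pattern)[i] concrete, here is a minimal Python sketch that uses only the standard library. The markup is invented for illustration. Note that xml.etree.ElementTree does not support the parenthesized (pattern)[i] form, so indexing the list returned by findall stands in for it; none of this is a Selenium call.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: three list items, each holding a single link.
doc = ET.fromstring(
    "<ul>"
    "<li><a>Alpha</a></li>"
    "<li><a>Beta</a></li>"
    "<li><a>Gamma</a></li>"
    "</ul>"
)

# pattern[i]: the predicate applies per parent, so a[2] means
# "an <a> that is the second <a> child of its own <li>". None exist here.
second_within_parent = doc.findall(".//li/a[2]")

# (pattern)[i]: pick the i-th match overall. ElementTree has no syntax
# for this, so collect every match and index the list, counting from 1
# the way XPath does.
all_links = doc.findall(".//li/a")
link_names = [all_links[i - 1].text for i in range(1, len(all_links) + 1)]

print(second_within_parent)
print(link_names)
```

This is probably why a per-parent selector such as nth-child runs out of range quickly: each li only has a handful of children, however many links the page has in total.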


SimpleDom Search Via plaintext Text

I am using the "PHP Simple HTML DOM Parser" library and want to find elements based on their text value (plaintext).
For example, I need to find the span element whose value is "Red".
<span class="color">Red</span>
I was expecting the code below to work, but it seems that it just replaces the value instead of searching for it.
$brand = $html->find('span',0)->plaintext='Red';
I read the manual and also looked through the library code itself, but was not able to find a solution. Kindly advise whether I am missing something or whether it is simply not possible with Simple HTML DOM Parser.
P.S.
Kindly note that I am aware of other ways, like regex.
Using $html->find('span', 0) will find the Nth span, where in this case N is zero (the first span).
Using $html->find('span',0)->plaintext='Red'; will set the plaintext of that span to Red rather than search for it.
If you want to find the elements where the text is Red you could use a loop and omit the 0 to find all the spans.
For example, using innertext instead of plaintext:
$spansWithRedText = [];
foreach($html->find('span') as $element) {
    if ($element->innertext === "Red") {
        $spansWithRedText[] = $element;
    }
}
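The same "loop over every span and compare its text" idea can be sketched in Python with the standard library's html.parser. This is not Simple HTML DOM Parser; the class name SpanTextFinder and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class SpanTextFinder(HTMLParser):
    """Collect the attributes of every <span> whose text equals `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.matches = []   # attribute dicts of matching spans
        self._open = []     # currently open spans: (attrs, text parts)

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._open.append((dict(attrs), []))

    def handle_data(self, data):
        # Text belongs to every span that is still open (handles nesting).
        for _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            attrs, parts = self._open.pop()
            if "".join(parts).strip() == self.wanted:
                self.matches.append(attrs)


# Hypothetical input: only the exact text "Red" should match.
html = ('<span class="color">Red</span>'
        '<span class="color">Reddish</span>'
        '<span class="size">Red</span>')
finder = SpanTextFinder("Red")
finder.feed(html)
finder.close()
print(finder.matches)
```

The comparison is an exact match after trimming whitespace, so "Reddish" is skipped.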

Forming a sequence of css selectors as argument to pup to get particular value from javadoc html

It isn't often that I attempt to implement something that attempts to integrate three different languages (four, if you count bash), sort of.
I want to write a little tool that scans the HTML files in the Java JDK javadoc package, focusing on blocks like the following:
<dl>
<dt><span class="simpleTagLabel">Since:</span></dt>
<dd>1.8</dd>
</dl>
I want to get the "1.8" value out of this.
So, I figured I would find a command-line tool that can parse HTML and figure out how to extract this.
I found the "pup" tool (which is written in "go"), and it seems to be close, but I now have to figure out the correct sequence of CSS selectors to get to this element. I've tried several variations, but nothing that really is doing what I need.
Update:
The answer from Sølve Tornøe comes close, and in fact I can implement somewhat of a kludge to get the data I want.
If I just use 'dl dt + dd', it gives me a lot of elements that match that pattern. Ideally, I wish I could do something like 'dl dt (> span[class="simpleTagLabel"]) + dd', where the "> span ..." thing is used for matching, but having it "pop back up" after matching the span, so it can look for peers of "dt". I imagine there's no way to do this in CSS.
My big kludge workaround is to assume that all of my real candidate elements have the text "1." in them. With that big assumption, I can use 'dl dt + dd:contains("1.")'. This at least works with the data I'm working with.
You can combine the child combinator (>) and the adjacent sibling combinator (+) with the element names (dl, dt, dd) like this:
dl > dt + dd
This translates to: give me the dd element that is an adjacent sibling of a dt that is itself a child of a dl.
console.log(document.querySelector('dl > dt + dd').innerText)
dl > dt + dd {
color: salmon;
}
<dl>
<dt><span class="simpleTagLabel">Since:</span></dt>
<dd>1.8</dd>
</dl>
If you're willing to use XPath instead of css selectors, you can easily step up through parent nodes of matched elements. This can be done with the perl XML::XPath command line tool, or xmllint:
$ xpath -q -e "//dt/span[contains(@class,'simpleTagLabel')]/../../dd/text()" < test.html
1.8
$ xmllint --xpath "//dt/span[contains(@class,'simpleTagLabel')]/../../dd/text()" test.html
1.8
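If a small script is acceptable instead of a one-line selector, the "match the span, then pop back up to the dt's sibling" step can be sketched in Python with the standard library. This assumes the fragment parses as well-formed XML, which real javadoc pages may not; the function name since_values and the paramLabel class in the sample are invented for illustration.

```python
import xml.etree.ElementTree as ET


def since_values(xml_text):
    """Return the text of each <dd> that directly follows a 'Since:' <dt>."""
    root = ET.fromstring(xml_text)
    values = []
    for dl in root.iter("dl"):
        children = list(dl)
        # Walk adjacent pairs of children: (dt, dd) is the "dt + dd" pattern.
        for dt, dd in zip(children, children[1:]):
            if dt.tag != "dt" or dd.tag != "dd":
                continue
            # Look down into the dt for the label, then use the dd sibling.
            label = dt.find("span[@class='simpleTagLabel']")
            if label is not None and (label.text or "").strip() == "Since:":
                values.append((dd.text or "").strip())
    return values


# Hypothetical input with one unrelated block and one "Since:" block.
sample = """
<div>
  <dl>
    <dt><span class="paramLabel">Returns:</span></dt>
    <dd>the value</dd>
  </dl>
  <dl>
    <dt><span class="simpleTagLabel">Since:</span></dt>
    <dd>1.8</dd>
  </dl>
</div>
"""
print(since_values(sample))
```

Checking the label text as well as the class avoids relying on the value starting with "1.".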

Efficient way to check if an element is in a comma separated string?

I have a blacklist like "12,3,4,5,6,789",
I tried
set = {}
for element in string.gmatch("12,3,4,5,6,789", "([^"..", ".."]+)") do
    set[element] = true
end
if set[...] then
    ...
end
to check if an element is in the blacklist.
My program will process more than one (element, blacklist) pair per request; for each pair I build a set and only use it once.
I thought that was inefficient and tried to use string.match, but Lua patterns are not standard regex and I failed to write a pattern that correctly matches an element at the start, middle, and end of the blacklist at the same time.
Will string.match be more efficient than building a set?
How do I write a proper pattern?
Is there a more efficient way?
Pattern matching is easiest when there are no corner cases:
string.match(","..blacklist..",",
","..element..",")
Short solution:
local blacklist = "31,415,9265,3589,7932,3846,2643,383,279"
local item = "383"
blacklist = ","..blacklist..","
if blacklist:find(","..item..",") then
    print("found in the blacklist")
end
P.S. This is my understanding of the original task.
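The comma-wrapping trick is language independent. Here is a minimal Python sketch of the same idea; the function name in_blacklist is made up for illustration.

```python
def in_blacklist(item: str, blacklist: str) -> bool:
    """Return True if item is one of the comma-separated entries."""
    # Wrapping both sides in commas makes first, middle and last entries
    # look identical, so one plain substring test covers every position.
    return f",{item}," in f",{blacklist},"


blacklist = "31,415,9265,3589,7932,3846,2643,383,279"
print(in_blacklist("383", blacklist))  # a whole entry
print(in_blacklist("38", blacklist))   # only a prefix of an entry
```

In Lua, string.find also accepts a fourth plain argument that turns off pattern matching, which may be worth using if an element could ever contain pattern characters.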

Selenium IDE - Select checkbox on table row

I'm using Selenium IDE and I have a table with many rows and columns. Each row has its own checkbox to select that row.
I was using this command to search for a specific row:
css=tr:contains('US Tester4') input[type="checkbox"]
But the problem is that in this column I have some other similar values like "US Tester41", "US Tester42" ... and when I use this command, it selects the wrong row.
I thought replacing the word "contains" with something like "equals" or "exactly" would work, but it didn't (I don't know the syntax).
Any ideas?
Here is a screenshot:
http://oi41.tinypic.com/2ake9hw.jpg
I'm not familiar with Selenium IDE, but with the selenium webdriver I would use an xpath. So I guess something like this will work for you:
xpath=//tr[td[3][text()='US Tester4']]//input[@type='checkbox']
This worked for me:
//tr//td[.='US Tester4']//input[type="checkbox"]
against:
<table>
<tr><td>US Tester</td>input(type="checkbox")</tr>
<tr><td>US Tester4</td>input(type="checkbox")</tr>
<tr><td>US Tester41</td>input(type="checkbox")</tr>
<tr><td>US Tester412</td>input(type="checkbox")</tr>
</table>
It matched the second element.
This worked for me:
xpath=(//input[@name='uid'])[2]
The 2 is the position of the element among the matches.
I'm not very familiar with the IDE, but I have used WebDriver before. If possible, I would use this XPath:
xpath = "//td[. = 'US Tester4']/preceding-sibling::td//input[@type = 'checkbox']"
This should locate only one element on screen. The preceding-sibling and following-sibling axes are very helpful when you don't have a good enough identifier on the exact element you want to find. In your case the td which contains the checkbox doesn't have a good identifier, whereas the td after it has text which you can match using the '=' operator. You just need to use preceding-sibling to find the td with the checkbox.
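The difference between an exact text match and a contains-style match can be tried outside Selenium. Below is a minimal Python sketch using the standard library; the table layout (checkbox in the first cell, name in the second) and the value attributes are assumptions, since the real page is only shown in the screenshot.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: checkbox in the first cell, name in the second.
table = ET.fromstring("""
<table>
  <tr><td><input type="checkbox" name="uid" value="1"/></td><td>US Tester</td></tr>
  <tr><td><input type="checkbox" name="uid" value="2"/></td><td>US Tester4</td></tr>
  <tr><td><input type="checkbox" name="uid" value="3"/></td><td>US Tester41</td></tr>
</table>
""")

# [td='US Tester4'] keeps only rows with a cell whose whole text equals
# the string, so "US Tester41" is not matched.
exact_rows = table.findall(".//tr[td='US Tester4']")
checkbox = exact_rows[0].find(".//input[@type='checkbox']")

# A substring test, like contains, matches two rows here.
loose_rows = [tr for tr in table.iter("tr")
              if "US Tester4" in "".join(tr.itertext())]

print(len(exact_rows), checkbox.get("value"), len(loose_rows))
```

The exact comparison is what the '=' operator gives you in XPath, as opposed to the substring behaviour of contains.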

How do I match individual CSS attributes using RegEx

I'm trying to expand a minified CSS file (don't ask) to make it human readable.
I've managed to get most of the expanding done but I'm stuck at a very weird case that I can't figure out.
I have CSS that looks like this:
.innerRight {
border:0;color:#000;width:auto;padding-top:0;margin:0;
}
a {
color:#000;text-decoration:underline;font-size:12px;
}
p,small,ul,li {
color:#000;font-size:12px;padding:0;
}
I've tried (.+):(.+); as the search and \t\1: \2;\n as the replace. The find RegEx is valid, the only problem is that it matches the entire line of attributes. I've tried the non-greedy character, but I must not be putting it in the right place.
What the above find RegEx matches is:
0: border:0;color:#000;width:auto;padding-top:0;margin:0;
1: color:#000;text-decoration:underline;font-size:12px;
2: color:#000;font-size:12px;padding:0;
While those are technically correct matches, I need it to match border:0;, color:#000;, etc separately for my replace to work.
Try non-greedy matching. This works for me:
(.+?):(.+?);
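A quick way to see the difference between the greedy and non-greedy patterns is to run both against one of the lines from the question. This is a small Python sketch with the standard re module; the replacement string is the one from the question.

```python
import re

line = "border:0;color:#000;width:auto;padding-top:0;margin:0;"

# Greedy: the first group swallows everything up to the last colon,
# so the whole line is a single match.
greedy = re.findall(r"(.+):(.+);", line)

# Non-greedy: each group stops at the first colon or semicolon it can,
# so every declaration is matched on its own.
lazy = re.findall(r"(.+?):(.+?);", line)

# Same search and replace as in the question, one declaration per line.
expanded = re.sub(r"(.+?):(.+?);", r"\t\1: \2;\n", line)

print(len(greedy), len(lazy))
print(expanded)
```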
Forget the colon. Just replace all semicolons with ";\n".
In Javascript, for example, you could write:
text = text.replace(/;/gm,";\n");
I would further refine that to address leading-space issues, etc., but this will put every style rule on its own line.
