How to replace a specific node with XQuery

I want to replace every <br/> that directly follows a period (.<br/>) with <p/> or another tag, while keeping my text and the other tags unchanged. For example:
<div class="page">
<div class="text">
adsdfasdf<br/>sadasdsafg<br/>kkot.<br/>
<div class="pagenumber">3</div>
</div>
</div>

To answer the question in your title (finding the node): select all text nodes that end with a dot, take the first following sibling of each, and test whether it is a <br/> element. Using the first following sibling rather than the next following node matters, because the latter would also return nodes in cases like <div>foobar.</div><br/>. Only the name is tested here; I'm not checking whether the element is empty.
//text()[ends-with(., '.')]/following-sibling::node()[1]/self::br
Renaming the node - what you asked for in your question body - works like this:
rename node //text()[ends-with(., '.')]/following-sibling::node()[1]/self::br as 'p'
Renaming requires a processor that supports XQuery Update. Merely finding the node works with any XQuery 1.0 processor, and even with an XPath 2.0 processor.
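If no XQuery Update processor is at hand, the same selection logic can be approximated in a general-purpose language. The following is only a sketch using Python's standard xml.etree.ElementTree, not XQuery; the function name rename_br_after_dot and the single-line SAMPLE string are made up for illustration. ElementTree has no text nodes, so "the text node before an element" is read from the parent's .text or from the previous sibling's .tail.

```python
import xml.etree.ElementTree as ET

# The markup from the question, written on one line (an assumption for this sketch).
SAMPLE = (
    '<div class="page"><div class="text">'
    'adsdfasdf<br/>sadasdsafg<br/>kkot.<br/>'
    '<div class="pagenumber">3</div>'
    '</div></div>'
)


def rename_br_after_dot(root, new_tag="p"):
    """Rename each <br/> that directly follows text ending in '.'; return the count."""
    renamed = 0
    for parent in root.iter():
        # Text before the first child is stored in parent.text;
        # text before any later child is stored in the previous sibling's .tail.
        preceding_text = parent.text
        for child in list(parent):
            if child.tag == "br" and preceding_text and preceding_text.endswith("."):
                child.tag = new_tag
                renamed += 1
            preceding_text = child.tail
    return renamed


root = ET.fromstring(SAMPLE)
count = rename_br_after_dot(root)
result = ET.tostring(root, encoding="unicode")
print(count)
print(result)
```

As with the XPath above, a <br/> that follows an element such as <div>foobar.</div> is left alone, because there is no text directly before it.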

Related

Simplifying long CSS selectors

I have the following CSS selector:
#AllContextMenus :not(.menu-iconic-left):not(.menu-accel):not(.menu-accel-left):not(.menu-accel-container):not(.menu-accel-container-left):not(.menu-iconic-accel):not(.menu-right)::before
For readability purposes, I like to keep all code lines under 100 characters.
Is there any way to simplify, optimize, or write this CSS selector without changing what it matches and without reducing performance?
For example, is there any type of "and" operator that can be used within :not()?
You generally can't simplify a selector without changing the semantics of what it matches.
But you can break a selector up into multiple lines at many points to meet maximum line length requirements. Just use a comment and put the line break inside the comment. Like this:
#AllContextMenus :not(.menu-iconic-left)/*
*/:not(.menu-accel)/*
*/:not(.menu-accel-left)/*
*/:not(.menu-accel-container)/*
*/:not(.menu-accel-container-left)/*
*/:not(.menu-iconic-accel)/*
*/:not(.menu-right)::before
#AllContextMenus :not(.menu-iconic-left)/*
*/:not(.menu-accel)/*
*/:not(.menu-accel-left)/*
*/:not(.menu-accel-container)/*
*/:not(.menu-accel-container-left)/*
*/:not(.menu-iconic-accel)/*
*/:not(.menu-right)::before {
color:red;
content:'TEST '
}
<section id="AllContextMenus">
<div class="a">A</div>
<div class="menu-iconic-accel">menu-iconic-accel</div>
</section>

Scraping for a rank number using Nokogiri in Ruby

I'm still doing some web scraping practice using this article:
https://www.pastemagazine.com/articles/2018/01/the-75-best-tv-shows-on-netflix-2018.html
I'd like to get just the rank number of each show and found what I think is the HTML element:
<div class="copy entry manual-ads">
<p>
<b class="big">
"75."
<i>
Chewing Gum
</i>
</b>
</p>
</div>
I'm using the following code to grab just the rank number (in this case, "75."):
doc.css("b.big").text
However, it returns the rank number along with the show title. How can I get just the rank number?
Use a regex to pull the first run of digits out of the text:
doc.css("b.big").text[/\d+/]
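Note that calling .text on the whole node set joins the text of every matched element, so the one-liner yields only the first number on the page, and without the trailing dot. To get one rank per show, the regex has to be applied to each element's text in turn. Here is the per-element step sketched in Python rather than Ruby, purely to show the regex logic in isolation; extract_rank is a hypothetical helper and element_text stands in for the text of one b.big element from the question.

```python
import re

# Text of one <b class="big"> element from the question, including the
# whitespace around the nested <i> title.
element_text = "\n75.\n\nChewing Gum\n\n"


def extract_rank(text):
    """Return the first run of digits as an int, or None if there are no digits."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


rank = extract_rank(element_text)
print(rank)
```

In Ruby the equivalent would be to map over doc.css("b.big") and apply the same regex to each node's text.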

Selenium XPath to fetch the value of an element that has no id associated with it

I want to fetch the value 606 from the following HTML using Selenium:
<div class="price pad-15 per-person budget-pp marg-left-10 ">
<span>From</span>
<h2 class="size-28 dis-inblock">
<span class="size-22">£</span>
606
</h2>
<span>Per person</span>
</div>
Can anyone please help me identify an XPath for the value 606? Thanks in advance.
The XPath for the element that contains 606 is:
//h2[span[text()="£"]]
You can fetch the value with the appropriate method in your programming language (for example .text or .get_attribute("textContent") in Python).
Let me know in case of any issues
//div/h2/text() here is enough.
//text()[. = '606'] would work too (but I doubt it's what you require here!)
You can use the CSS selector below as well:
div.price.per-person > h2
(Assuming you're using Java) After locating the element with the selector above, call WebElement#getText() to fetch its text. This returns the text as £606, so you will need to strip the £ in code to get the value you actually want.
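The last step, separating 606 from the £, can be sketched without a browser. The example below is an illustration in Python using only the standard library, with xml.etree.ElementTree standing in for Selenium; the variable names are made up, and the exact whitespace a real driver returns may differ. It shows both approaches from the answers: taking the full text of the h2 and extracting the digits, and reading only the text that follows the inner span.

```python
import re
import xml.etree.ElementTree as ET

# The markup from the question.
SNIPPET = """<div class="price pad-15 per-person budget-pp marg-left-10 ">
<span>From</span>
<h2 class="size-28 dis-inblock">
<span class="size-22">£</span>
606
</h2>
<span>Per person</span>
</div>"""

root = ET.fromstring(SNIPPET)
h2 = root.find("h2")

# Approach 1: take all text inside <h2> (currency symbol included),
# then keep only the first run of digits.
full_text = "".join(h2.itertext())
price = int(re.search(r"\d+", full_text).group())

# Approach 2: read only the text that follows the inner <span>,
# which ElementTree stores in that span's .tail.
own_text = h2.find("span").tail.strip()

print(price, own_text)
```

With Selenium, approach 1 corresponds to running the regex over the string returned by getText() (or .text in Python).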

How can I use custom classes in chapter titles with Asciidoc epub3 converter?

In the adoc file I define a chapter header like:
== [big-number]#2064# Das Spiele-Labor
For HTML that translates to
<span class="big-number">2064</span>
For the EPUB version, converted with asciidoctor-epub, the class is apparently omitted. The relevant line in converter.rb:
<h1 class="chapter-title">#{title_upper}#{subtitle ? %[ <small class="subtitle">#{subtitle_formatted_upper}</small>] : nil}</h1>
(/var/lib/gems/1.9.1/gems/asciidoctor-epub3-1.5.0.alpha.7.dev/lib/asciidoctor-epub3/converter.rb)
How can I get the class information over to the chapter-title to format the first number in a special way?
Or is there another way to solve this? (The first number of the chapter title should be large, and CSS has no 'first-word' pseudo-element.)

What is the difference between [attribute~=value] and [attribute*=value]?

I cannot find the difference between these two selectors. Both seem to do the same thing, i.e. select tags whose attribute value contains a given string.
For [attribute~=value] : http://www.w3schools.com/cssref/sel_attribute_value_contains.asp
For [attribute*=value] : http://www.w3schools.com/cssref/sel_attr_contain.asp
The first one ([attribute~=value]) is a whitespace-separated search...
<!-- Would match -->
<div class="value another"></div>
...and the second ([attribute*=value]) is a substring search...
<!-- Would match -->
<div class="a_value"></div>
W3Schools doesn't appear to make this distinction very clear. Use a better resource.
[attribute~="value"] selects elements that contain a given word delimited by spaces while [attribute*="value"] selects elements that contain the given substring.
For example, [data-test~="value"] would not match on the below div while [data-test*="value"] would.
<div data-test="my values go here"></div>
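The two matching rules can be written down as a pair of small functions. This is a rough Python emulation for illustration only, not how a browser implements selectors; matches_word and matches_substring are hypothetical names, and Python's str.split() treats a slightly wider set of characters as whitespace than CSS does.

```python
def matches_word(attr_value, word):
    """Emulate [attr~=word]: word must equal one whitespace-separated token."""
    # An empty word, or one that itself contains whitespace, never matches.
    if not word or any(ch.isspace() for ch in word):
        return False
    return word in attr_value.split()


def matches_substring(attr_value, fragment):
    """Emulate [attr*=fragment]: fragment may appear anywhere in the value."""
    # An empty fragment never matches.
    return bool(fragment) and fragment in attr_value


# The attribute values used in the answers above.
for value in ("value another", "a_value", "my values go here"):
    print(repr(value), matches_word(value, "value"), matches_substring(value, "value"))
```

Only "value another" satisfies the word match, while all three values satisfy the substring match.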
