I have an XSLT function that takes a regular expression as a parameter but the XSLT parser does not like it.
Here is the code:
<xsl:value-of select='ns:RegexReplace($variable, "", "style=\"\w+\:\s\w+;\"")' disable-output-escaping='yes' />
I found this:
http://www.xml.com/pub/a/2003/06/04/tr.html <-- but it is using what I am and seems to work (for them). Do I just have a rubbish parser??
Is there any way of doing this?
Or, a way of forcing an element to ignore inline style via a CSS trick?
You seem to be trying to include quotes in a quote-delimited XPath string literal by escaping them with a backslash. That does not work.
In XPath 1.0 (XSLT 1), there is no nice way to do this. You may need to resort to tricks like defining a variable which holds a single quote character and using the concat function to create your string:
<xsl:variable name='quot' select="'&quot;'"/>
<xsl:value-of select='concat("a string with a quote ", $quot, " character")'/>
In XPath 2.0 (XSLT 2), you can escape a quote with another quote:
<xsl:value-of select='"a string with a quote "" character"'/>
It occurs to me that you may be trying to remove style attributes. If that is the case, then string replacement is not going to help you.
You can remove style attributes for example by writing a template which matches them and outputs nothing:
<xsl:template match="@style"/>
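Outside XSLT, the same idea (drop the attribute from the parsed tree rather than editing the markup as a string) can be sketched in Python with the standard library. This is only an illustration, not part of the stylesheet above; the function name strip_style and the sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET


def strip_style(xml_text):
    """Return xml_text with every style attribute removed (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # iter() visits the root element and all of its descendants.
    for element in root.iter():
        # pop with a default does nothing when the attribute is absent.
        element.attrib.pop("style", None)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(strip_style('<p style="color: red;">hi <b style="x: y;">there</b></p>'))
```

Because the attribute is removed from the tree, the quoting and escaping of its value never comes into play.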
I have a html like this.
<div id="video1" value="&lt;iframe src=&quot;https://www.move.com/99&quot;&gt;&lt;/iframe&gt;"
class="movie"></div>
I want to get the url [ https://www.movie.com/99 ] by using XPath.
However, the escape characters and the nested markup make it difficult.
How can I get it using XPath or some other means?
If you have an escaped XML document within an attribute or text node of an outer document, then the only way you can use XPath to probe into the inner document is to parse it first. In XPath 3.1 you can do
parse-xml(div/@value)/iframe/@src
but that's not possible in older XPath versions.
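Where XPath 3.1 is not available, the same two-step parse can be done in a host language. A minimal Python sketch, assuming the attribute value holds well-formed, entity-escaped markup; the function name inner_src is hypothetical.

```python
import xml.etree.ElementTree as ET


def inner_src(outer_markup):
    """Parse the outer element, then parse its value attribute as a second document."""
    div = ET.fromstring(outer_markup)
    # The parser has already unescaped the attribute, so this is real markup again.
    inner = ET.fromstring(div.get("value"))
    return inner.get("src")


if __name__ == "__main__":
    outer = (
        '<div id="video1" '
        'value="&lt;iframe src=&quot;https://www.move.com/99&quot;&gt;&lt;/iframe&gt;" '
        'class="movie"></div>'
    )
    print(inner_src(outer))
```

This only works when the inner fragment is itself well-formed XML; real-world HTML often is not.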
An easy approach would be using substring functions like this:
substring-before(substring-after(div[@id='video1' and @class='movie']/@value,'"'),'"')
This expression selects the string between the first two double quotes (written as &quot; in the source) of the @value attribute.
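To see what the nested substring calls do, here is a rough Python rendering of the two XPath functions applied to the already unescaped attribute value. The helper names mirror the XPath ones but are written for this example only.

```python
def substring_after(text, sep):
    """Like XPath substring-after: the part after the first sep, or '' if sep is absent."""
    _, found, tail = text.partition(sep)
    return tail if found else ""


def substring_before(text, sep):
    """Like XPath substring-before: the part before the first sep, or '' if sep is absent."""
    head, found, _ = text.partition(sep)
    return head if found else ""


if __name__ == "__main__":
    value = '<iframe src="https://www.move.com/99"></iframe>'
    # Everything after the first quote, then everything before the next one.
    print(substring_before(substring_after(value, '"'), '"'))
```

The approach relies on src being the first quoted attribute in the value; if another attribute came first, that one would be returned instead.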
The regex pattern (?i)(?<=<data name=")\w+(?=") captures test from
<data name="test" xml:space="preserve">
<value>123</value>
</data>
But what does the "(?i)" mean in regex?
It's a way of specifying that the matching should be case insensitive.
Here's the MSDN page on Regex options:
By applying inline options in a regular expression pattern with the syntax (?imnsx-imnsx). The option applies to the pattern from the point that the option is defined to either the end of the pattern or to the point at which the option is undefined by another inline option.
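The quoted documentation is for .NET, but the effect is easy to try elsewhere. A small Python sketch, assuming Python's re module, which also accepts (?i) at the start of a pattern; the function name capture_name is made up for the example.

```python
import re

# Same pattern as in the question; (?i) switches on case-insensitive matching.
PATTERN = r'(?i)(?<=<data name=")\w+(?=")'


def capture_name(text, pattern=PATTERN):
    """Return the first match of pattern in text, or None when nothing matches."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(capture_name('<data name="test" xml:space="preserve">'))
    # With (?i) an upper-case tag still matches ...
    print(capture_name('<DATA NAME="Test">'))
    # ... and without it the same input does not.
    print(capture_name('<DATA NAME="Test">', pattern=PATTERN[len("(?i)"):]))
```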
But really, it looks like you're processing XML, in which case, you should really be using an XML parser, not regular expressions. There are classes built into the framework for working with XML which properly respect all of the rules of XML. Treating XML as "just a string" tends to lead to brittle solutions.
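The answer refers to the XML classes in the .NET framework; as a language-neutral illustration of the same advice, here is a Python sketch using the standard-library parser on the snippet from the question. The function name read_data is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_data(xml_text):
    """Return (name attribute, text of the value child) of a data element."""
    data = ET.fromstring(xml_text)
    # The parser deals with quoting, whitespace and entities for us.
    return data.get("name"), data.findtext("value")


if __name__ == "__main__":
    snippet = '<data name="test" xml:space="preserve">\n<value>123</value>\n</data>'
    print(read_data(snippet))
```

Unlike the regex, this keeps working if the attributes are reordered or single quotes are used.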
I am trying this in an XQuery (assume that doc('input:instance') does indeed return a valid XML document) which is generated using XSLT
let $a:= <xsl:text>"<xsl:copy-of select="doc('input:instance')//A" />"</xsl:text>
let $p := <xsl:text>"<xsl:copy-of select="doc('input:instance')//P" />"</xsl:text>
let $r := <xsl:text>"<xsl:copy-of select="doc('input:instance')//R" />"</xsl:text>
But I get the error:
xsl:text must not contain child elements
How do I retrieve XML results using the XPath in xsl:copy-of and then encode the special characters received in the result while formatting the result as string? I would be happy to use CDATA section if that's possible (if I do that instead of xsl:text above, xsl:copy-of is not evaluated since it becomes part of CDATA section).
Obviously I am a newcomer to XSL...
What you need here is the ability to serialize an XML document (here the document returned by doc()) using the XML serialization, into a string.
Various XQuery implementations have extension functions for this purpose. For example, if you are using Saxon:
saxon:serialize(document, 'xml')
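To make "serialize into a string" concrete, here is a rough Python analogue using only the standard library. It is not the Saxon function, just a sketch of the same two steps: turn the node back into markup text, then escape the special characters so the text can be embedded elsewhere. The function name serialize_escaped is made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def serialize_escaped(element):
    """Serialize element to markup text, then escape &, < and > in that text."""
    markup = ET.tostring(element, encoding="unicode")
    return escape(markup)


if __name__ == "__main__":
    node = ET.fromstring("<A><b>x</b></A>")
    print(ET.tostring(node, encoding="unicode"))
    print(serialize_escaped(node))
```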
This has nothing to do with XQuery (you could be building the XSLT stylesheet with any language, even XSLT itself!).
From http://www.w3.org/TR/xslt20/#xsl-text
<!-- Category: instruction -->
<xsl:text
[disable-output-escaping]? = "yes" | "no">
<!-- Content: #PCDATA -->
</xsl:text>
[...] The content of the xsl:text
element is a single text node whose
value forms the string value of the
new text node.
hi anyone knows how to replace a html tag in a string with specific characters:
e.g.
string s1 = "<span style="italic">inluding <span style="bold">other</span> tags </span>";
string s2 = "<span style="italic">inluding </span><span style="bold">other tags </span>";
i want to replace each "span" that has "bold" with "bOpen" and "bClose", and each "span" that has "italic" with "iOpen" and "iClose", in both c# and javascript.
thanks very much.
thanks for the response, i did use a regular expression to do that: res = Regex.Replace(res, ".*?", replaceHtmlBold); but it can't match the nested tags and the non-nested tags at the same time. could you help please?
JavaScript's String Object has a handy function that lets you replace words that occur within the string. So does C#.
Regular expressions are your friends here. I could give you the exact code for your problem, but then you'll miss the point of learning this technique. Here is an Introduction to Regular Expressions and there is this article "C# Regular Expressions". If you need more, Google is your friend.
Good luck!
PS: I realized now what the real problem is. I think you can get away with lookaround techniques and conditionals. Both are summarized here.
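One way around the nesting problem, different from the lookaround idea above, is to match opening and closing tags one at a time and keep a stack of the styles that are currently open. The question asks for C# and JavaScript; the sketch below is in Python only to show the technique, and both languages accept a callback in their replace functions in the same way. It assumes the tags are balanced and that only the two styles from the question occur; the names MARKERS and replace_spans are made up.

```python
import re

# Hypothetical mapping from style value to (open marker, close marker).
MARKERS = {"bold": ("bOpen", "bClose"), "italic": ("iOpen", "iClose")}

# Matches either an opening span (capturing its style) or a closing span.
TAG = re.compile(r'<span style="(\w+)">|</span>')


def replace_spans(text):
    """Replace span tags with markers, pairing each close with the latest open."""
    open_styles = []

    def swap(match):
        style = match.group(1)
        if style is not None:
            # Opening tag: remember which style it started.
            open_styles.append(style)
            return MARKERS[style][0]
        # Closing tag: it belongs to the most recently opened span.
        return MARKERS[open_styles.pop()][1]

    return TAG.sub(swap, text)


if __name__ == "__main__":
    s1 = '<span style="italic">inluding <span style="bold">other</span> tags </span>'
    s2 = '<span style="italic">inluding </span><span style="bold">other tags </span>'
    print(replace_spans(s1))
    print(replace_spans(s2))
```

The stack is what lets a single pass handle both the nested and the non-nested string; an unknown style or an unmatched closing tag would raise an error in this sketch.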
I use XSLT to transform an XML document which I then load onto an ASP.NET website. However, if the XML contains '<' characters, the XML becomes malformed.
<title><b> < left arrows <b></title>
If I use disable-output-escaping="yes", the XML cannot be loaded and I get the error "Name cannot begin with the '' character".
If I do not disable output escaping the escaped characters are disregarded and the text appears as it is:
<title><b> < left arrows <b></title>
I want the bold tags to work, but I also want to escape the '<' character. Ideally
<b>&lt; left arrows</b>
is what I want to achieve. Is there any solution for this?
The XML should contain the escaped sequence for the less than sign (&lt;), not the literal < character. The XML is malformed and any XML parser must reject it.
In XSLT you could generate that sequence like this:
<xsl:text>&lt;</xsl:text>
From what I understand, the input contains HTML and literal < characters. In that case, disable-output-escaping="yes" will preserve the HTML tags but produce invalid XML and setting it to no means the HTML tags will be escaped.
What you need to do is leave disable-output-escaping set to "no" (which is the default, so you don't actually have to add it) and add an XSLT rule that will copy the HTML tags. For instance:
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
I came up with a solution, triggered by the last answer by Josh. Thanks Josh. I tried to use the match template, but I had a problem because the html tags are placed within CDATA, so I had difficulty doing a match. There might be a way to do it, but I gave up on that.
What I did was a test="contains($text, $replace)", where $replace is the '<' character, and on top of that I added a condition to test whether the substring after the '<' is a relevant html tag, i.e. whether it is actually a <b> or </b>. If it's just a '<' character not belonging to any html tag, I convert the '<' to its escaped form, &lt;. Basically that solved my problem. Hope this is useful to anyone who encounters the same problem.
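The rule described here (escape a '<' unless it starts <b> or </b>) is compact to state as a regular expression. The following Python sketch is not the XSLT that was actually used, only an illustration of the same rule; the function name escape_stray_lt is made up and only the b tag is treated as relevant.

```python
import re

# A '<' that is NOT immediately followed by "b>" or "/b>".
STRAY_LT = re.compile(r"<(?!/?b>)")


def escape_stray_lt(text):
    """Escape every '<' that does not begin a <b> or </b> tag."""
    return STRAY_LT.sub("&lt;", text)


if __name__ == "__main__":
    print(escape_stray_lt("<b> < left arrows </b>"))
```

More tags could be allowed by widening the lookahead, at the cost of the pattern slowly turning into a hand-written HTML parser.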