Special characters handling RTF fields - tridion

We upgraded from Tridion 5.3 to Tridion 2011 SP1.
In Tridion 5.3 we were using VBScript templates; as part of this upgrade we are converting the existing VBScript templates to Compound Component Templates. We are facing the two issues below with the content of RTF fields.
Issue 1: Our existing RTF field content contains empty HTML tags in a number of places, e.g. <a name="Contact" id="Contact"></a>. When we publish the content with Compound Component Templates (Tridion 2011 SP1 environment), this anchor tag is converted to <a name="Contact" id="Contact" />, which breaks existing JavaScript functionality. To work around the issue we have written C# functions that find empty tags and fill in their inner text, so that the tag becomes <a name="Contact" id="Contact"> </a>, and then things work fine. But calling this function at the Component Template level for each RTF field is a big task, as we have a large number of Component Templates. Is there a better way to do it?
Issue 2: The same RTF field contains content like   (editors may have copy-pasted it from the web or elsewhere), so when we try to publish a page or component, publishing fails with the error:
JScriptException:Expression valueUnterminated String Constant.
Is there a default TBB that will help fix these issues?

Issue 1:
You can also use a Filtering XSLT to modify your RTF content on component save.
This way you can replace any empty tag <tag></tag> with <tag> </tag> on component save and don't need any further change on templating.
Issue 2:
  seems like an encoded , see character codes: http://www.escapecodes.info/
Maybe you can replace these character codes with the proper HTML encoding, using a filtering XSLT or a C# TBB.

As you already have a function for converting inline closed anchor tags to anchor tags with a non-breaking space in them you could consider using this function from your page template(s) instead of using it in every component template; this would require a much smaller number of templates to change...
You also might want to consider replacing inline closed anchor tags with properly closed anchor tags without actually inserting extra spaces.
Below is a C# fragment you can use in a TBB to replace inline closed anchor tags:
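// Requires: using System.Text.RegularExpressions; and using Tridion.ContentManager.Templating;
// Take the Output item out of the package so it can be rewritten.
Item outputItem = package.GetByName(Package.OutputName);
package.Remove(outputItem);
// Turn every self-closed anchor (<a ... />) into an explicit open/close pair (<a ...></a>).
string outputString = Regex.Replace(outputItem.GetAsString(), "(<a[^>]*?)/>", "$1></a>", RegexOptions.Singleline);
outputItem.SetAsString(outputString);
// Put the modified item back into the package under the same name.
package.PushItem(Package.OutputName, outputItem);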
You could extend it to also replace   with but this should not cause any issues as   is a valid escape sequence in HTML (Tridion RTF fields are essentially XML which could be the cause for   appearing instead of ...).
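A variation on the anchor regex above, sketched in Python only so the pattern is easy to try outside Tridion. The function name expand_self_closed_anchors is made up for this illustration, and the pattern would need porting back to Regex.Replace in a C# TBB. The one change is a word boundary after "<a", on the assumption that you do not want self-closed tags such as <area /> or <abbr /> rewritten as anchors:

```python
import re

# Matches a self-closed anchor such as <a name="x" id="x" />.
# \b after "a" stops the pattern from matching <area/>, <abbr/> and similar.
SELF_CLOSED_ANCHOR = re.compile(r"<a\b([^>]*?)\s*/>", re.DOTALL)


def expand_self_closed_anchors(html: str) -> str:
    """Rewrite <a ... /> as <a ...></a>, leaving all other markup untouched."""
    return SELF_CLOSED_ANCHOR.sub(r"<a\1></a>", html)


if __name__ == "__main__":
    print(expand_self_closed_anchors('<a name="Contact" id="Contact" />'))
```

Like the original fragment, this is a plain text substitution and will be confused by a ">" inside an attribute value, so treat it as a starting point rather than a complete fix.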

Related

How to elegantly modify html to inject html element after x-th paragraph on the server side?

I need to modify html coming from external file (server side) before I render it and inject a quote 'component' like this:
This component needs to be injected after the 2nd paragraph, and I'm planning to use HtmlAgilityPack. Are there any examples? Is the HtmlNode.InsertAfter() method a good choice once I have found the third paragraph, which should be trivial?
Another question: would it be possible to inject a Sitecore placeholder or even a user control that renders my quote instead of pure HTML? I feel it should be, but I'm not sure what a good approach would be.
Thanks
I can suggest two possible approaches here:
1) Use snippets with some customisation. Snippets allow users to insert pre-defined chunks of HTML into a RTE field. You could have a pre-defined piece of HTML which might have some identifier to indicate it should use custom processing (I would suggest some data-xxx style attribute which would not conflict with any CSS or JavaScript). Then you could create a new renderField pipeline processor which would detect the data-xxx attribute within the content of a rich text field - you would use HtmlAgilityPack for this and then replace that snippet with the contents of your server-side file.
-or-
2) Split your text content into two separate chunks and have two instances of a "HtmlText" rendering within the placeholder, with a rendering for your quote text between them in the same placeholder.
I would advise that having a rule to insert text after the second paragraph would be quite 'brittle' as this would be very reliant on content editors setting the rich text field contents in quite a precise way e.g. to always ensure two or more paragraphs and to always break text with paragraphs - they might decide to use a load of line breaks instead to split their text. That said if you did do this, you would create a new renderField pipeline processor.
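To make the brittleness concrete, here is a rough Python sketch of the "insert after the n-th paragraph" rule. It is not HtmlAgilityPack or Sitecore code: it uses a regex on closing </p> tags purely for illustration, and the function name and the fallback (append at the end when there are too few paragraphs) are assumptions. In a real renderField processor you would do the equivalent with a proper HTML parser.

```python
import re

# Closing paragraph tags, tolerant of case and stray whitespace (</P >).
CLOSING_P = re.compile(r"</p\s*>", re.IGNORECASE)


def insert_after_paragraph(html: str, snippet: str, n: int = 2) -> str:
    """Insert snippet right after the n-th closing </p>.

    If the content has fewer than n paragraphs (for example the editor used
    line breaks instead), the snippet is appended at the end instead.
    """
    matches = list(CLOSING_P.finditer(html))
    if len(matches) < n:
        return html + snippet
    cut = matches[n - 1].end()
    return html[:cut] + snippet + html[cut:]


if __name__ == "__main__":
    body = "<p>one</p><p>two</p><p>three</p>"
    print(insert_after_paragraph(body, "<blockquote>quote</blockquote>"))
```

The fallback branch is exactly the case described above: content that was not broken into paragraphs gives the rule nothing to anchor on, so some default position has to be chosen.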

Creating custom layouts for Images in page content TYPO3 6

TYPO3 provides an option to add multiple images to a page content element, but all the images are wrapped in some default <div> tags. I want these images to be wrapped in <ul> and <li> tags instead, with my own custom CSS ids and classes.
There are not many resources on TYPO3 to help me approach this issue. Does TYPO3 allow custom tags for page content elements?
UPDATE
Thanks to Jost's answer I was able to get my images displayed, but how do I split up the image details?
Each image will have a title, alt text, image path and image link. How do I retrieve these using TypoScript, given that each detail has to go in a separate tag?
Check the TypoScript object browser. There you will find the object tt_content, which contains the rendering definitions for content elements. The rendering definition for images is found at tt_content.image.20, for example
tt_content.image.20.imageStdWrap.dataWrap = <div class="csc-textpic-imagewrap" style="width:{register:totalwidth}px;"> | </div>
The default definitions given there are usually provided by the static TypoScript of CSS-styled-content. You can overwrite them in your own TS, but when updating to a newer TYPO3-version, the default template may change, which could result in additional wrappers.
Update
Most content rendering in TYPO3 is defined in the TypoScript object tt_content. You can browse all TS-objects that will be used on a page by selecting the "Template" module and the page in question, and then choose "TypoScript Object Browser" in the selectbox at the top of the window. To understand what that stuff means, knowledge of TypoScript is necessary (Tutorial, Reference).
You can add your own TypoScript, which may override existing settings. You can do that in the Template-module too, but usually this is done by creating a file containing the script somewhere in the fileadmin folder and including it from the Template module.
The above enables you to edit the markup of the page. (Additional) CSS is usually defined in external files, that are included by a PAGE object (see the reference about that).
This post is a bit older but I want to add the following:
If you want to understand how the different content elements are wrapped, you can have a look at the css_styled_content extension. I assume that you have included the "Static Template (from extension)" in your main TypoScript template.
You can find the setup.txt here:
typo3/sysext/css_styled_content/static/setup.txt
There you'll find the line Jost mentioned at line 860 (TYPO3 version 6.1), for example, and of course a lot of other definitions, too.
Also be sure to read the documentation and tutorials on typo3.org.
HTH
merzilla

Adding self closing tags in RTF field

We upgraded from Tridion 5.3 to Tridion 2011 SP1.
In many places in our existing RTF field content we use HTML elements like <a name="top" id="top"></a>. When we publish a component or page from Tridion, these anchor tags are converted to self-closing anchor tags: <a name="top" id="top" />. As a result, the entire content of the RTF field becomes a hyperlink, because the browser treats the tag as the start tag of an anchor <a>. When we check the page source in Firefox it says: "Self-closing syntax ("/>") used on a non-void HTML element. Ignoring the slash and treating as a start tag." To fix this we updated the existing content to <a name="top" id="top"> </a>, which works fine but is not a good solution. Are there any other ideas or configuration options that would prevent the conversion to self-closing tags?
I have a similar question about this here
I have posted my work around there. Hope it helps.
I am not sure what kind of templates you are using, but I generally post-process my output, looking for empty tags with an XSLT and the XSLT Mediator. When I find empty tags I convert them so that they contain whitespace, to prevent issues in the browsers viewing the final content.
<div></div> or <div/>
will get converted to
<div> </div>
Whilst the first examples are technically valid XML, they do (as you have discovered) break several browsers.
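The conversion shown above can be sketched as follows. This is a Python regex illustration of the idea, not the XSLT used with the XSLT Mediator; the function name and the list of void elements that are left alone (taken from the HTML specification) are choices made for this example.

```python
import re

# Elements that HTML defines as void: these may stay self-closed.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}

# Either <tag attrs/> or <tag attrs></tag> (optionally with only whitespace inside).
EMPTY_ELEMENT = re.compile(
    r"<([A-Za-z][\w:-]*)((?:\s[^<>]*?)?)\s*(?:/>|>\s*</\1\s*>)"
)


def pad_empty_elements(html: str) -> str:
    """Give every empty non-void element a single space as content."""

    def repl(match: re.Match) -> str:
        name, attrs = match.group(1), match.group(2)
        if name.lower() in VOID_ELEMENTS:
            return match.group(0)  # leave <br />, <img ... /> and friends alone
        return f"<{name}{attrs}> </{name}>"

    return EMPTY_ELEMENT.sub(repl, html)


if __name__ == "__main__":
    print(pad_empty_elements('<div/><a name="top" id="top" /><br />'))
```

Skipping void elements matters because padding something like <br /> into <br> </br> would produce invalid HTML. As with any regex over markup, attribute values containing "<" or ">" are not handled.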

Seeing html code in tooltips in asp.net

I am parsing web service XML and populating a TreeView in ASP.NET. I'm trying to display one of the XML node attributes as a tooltip, but that attribute sometimes has HTML tags in it. I know there seem to be some custom tooltip options out there, but I don't have the time or the experience to play with those yet. Is there no way to easily remove such markup or translate it into its textual equivalent? I know I can replace br tags with Environment.NewLine, but I don't want to have to do this for every conceivable HTML tag that might be embedded in the content!
The HTML Agility Pack is an HTML parser that can read HTML fragments. Load the fragment with it and then read the InnerText property of the top node; the result is a plain-text version of the HTML.
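The same idea (parse the fragment, keep only the text) can be sketched with Python's standard library parser. This is an analogue for illustration, not the HTML Agility Pack API; the class and function names are invented here, and turning br tags into newlines is an extra taken from the question rather than something InnerText does.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content and drops all tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)  # decode &amp; and friends
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closed tags such as <br/>.
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment: str) -> str:
    """Return the plain-text content of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


if __name__ == "__main__":
    print(html_to_text("Line one<br/>Line <b>two</b> &amp; more"))
```

Because a real parser does the work, there is no need to enumerate every tag that might appear in the attribute value.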

How to strip out character references inserted by Diazo into text nodes

I'm using plone.app.theming 1.0b5 and Plone 4.1 rc3. Our Diazo rules contain a number of external content includes and we're using one such include to insert the Google Analytics script into the result:
<append theme="/html/head" content="/html/head/script"
href="##standard-page-elements" />
Unfortunately the script tag from the view is being mangled during the transform such that any carriage returns are converted to character references (&#13;).
This is due to the way lxml serializes and deserializes (see this Plone bug report).
I'd like a work around in the meantime but can't figure out a Diazo rule that would strip these references out.
As noted above:
The bug has been fixed in the trunk of Diazo (thanks Laurence) so I no longer need to do this. I didn't manage to figure it out: it doesn't seem to me you can alter external content through Diazo, only the main content.
