I am generating a Tridion binary link as follows:
Razor TBB generates Link Resolver syntax for binary link
Link resolver TBB generates TCDL.
The output is published as a Dynamic component template of output type REL. The publication target specifies ASP.NET.
What I see in the COMPONENT_PRESENTATIONS table of the broker database is output like this:
<tcdl:Link type="binary" origin="tcm:0-0-0"
destination="tcm:34-669" templateURI="tcm:0-0-0"
linkAttributes="" textOnFail="true" addAnchor=""
variantId="">Document2</tcdl:Link>
so you'd expect at the least to see the text "Document2"
If I hand-craft a binary link control <tridion:BinaryLink..../> this works just fine, however there is no visible output generated by the TCDL listed above.
What might be going wrong? What should I investigate next?
We noticed the same behavior that the Link Resolver TBB does not generate the right case for binaries type. It being generated as <tcdl:Link type="binary" ../> instead of <tcdl:Link type="Binary" ../> (note the lower case b instead of Uppercase B, tough one to catch). The REL TCDLTagRender is case sensitive and does not resolve the tcdl:link with lowercase type:binary and you will see the warning message in cd log files (assume you have log level set to warn or debug).
"WARN LinkTagRenderer - Link type does not exist."
The work around is to replace the output of lowercase binary with the uppercase Binary by introducing a new TBB. We included this as part of the TBB to resolve the RTF field binary link resolving for any multimedia linking like pdf, doc etc..
You do a string replace the lowercase binary with Binary as below in the TBB.
string output = package.GetValue(Package.OutputName);
output = output.Replace("type=\"binary\"", "type=\"Binary\"" );
Related
Currently, I'm working on a feature that involves parsing XML that we receive from another product. I decided to run some tests against some actual customer data, and it looks like the other product is allowing input from users that should be considered invalid. Anyways, I still have to try and figure out a way to parse it. We're using javax.xml.parsers.DocumentBuilder and I'm getting an error on input that looks like the following.
<xml>
...
<description>Example:Description:<THIS-IS-PART-OF-DESCRIPTION></description>
...
</xml>
As you can tell, the description has what appears to be an invalid tag inside of it (<THIS-IS-PART-OF-DESCRIPTION>). Now, this description tag is known to be a leaf tag and shouldn't have any nested tags inside of it. Regardless, this is still an issue and yields an exception on DocumentBuilder.parse(...)
I know this is invalid XML, but it's predictably invalid. Any ideas on a way to parse such input?
That "XML" is worse than invalid – it's not well-formed; see Well Formed vs Valid XML.
An informal assessment of the predictability of the transgressions does not help. That textual data is not XML. No conformant XML tools or libraries can help you process it.
Options, most desirable first:
Have the provider fix the problem on their end. Demand well-formed XML. (Technically the phrase well-formed XML is redundant but may be useful for emphasis.)
Use a tolerant markup parser to cleanup the problem ahead of parsing as XML:
Standalone: xmlstarlet has robust recovering and repair capabilities credit: RomanPerekhrest
xmlstarlet fo -o -R -H -D bad.xml 2>/dev/null
Standalone and C/C++: HTML Tidy works with XML too. Taggle is a port of TagSoup to C++.
Python: Beautiful Soup is Python-based. See notes in the Differences between parsers section. See also answers to this question for more
suggestions for dealing with not-well-formed markup in Python,
including especially lxml's recover=True option.
See also this answer for how to use codecs.EncodedFile() to cleanup illegal characters.
Java: TagSoup and JSoup focus on HTML. FilterInputStream can be used for preprocessing cleanup.
.NET:
XmlReaderSettings.CheckCharacters can
be disabled to get past illegal XML character problems.
#jdweng notes that XmlReaderSettings.ConformanceLevel can be set to
ConformanceLevel.Fragment so that XmlReader can read XML Well-Formed Parsed Entities lacking a root element.
#jdweng also reports that XmlReader.ReadToFollowing() can sometimes
be used to work-around XML syntactical issues, but note
rule-breaking warning in #3 below.
Microsoft.Language.Xml.XMLParser is said to be “error-tolerant”.
Go: Set Decoder.Strict to false as shown in this example by #chuckx.
PHP: See DOMDocument::$recover and libxml_use_internal_errors(true). See nice example here.
Ruby: Nokogiri supports “Gentle Well-Formedness”.
R: See htmlTreeParse() for fault-tolerant markup parsing in R.
Perl: See XML::Liberal, a "super liberal XML parser that parses broken XML."
Process the data as text manually using a text editor or
programmatically using character/string functions. Doing this
programmatically can range from tricky to impossible as
what appears to be
predictable often is not -- rule breaking is rarely bound by rules.
For invalid character errors, use regex to remove/replace invalid characters:
PHP: preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $s);
Ruby: string.tr("^\u{0009}\u{000a}\u{000d}\u{0020}-\u{D7FF}\u{E000}-\u{FFFD}", ' ')
JavaScript: inputStr.replace(/[^\x09\x0A\x0D\x20-\xFF\x85\xA0-\uD7FF\uE000-\uFDCF\uFDE0-\uFFFD]/gm, '')
For ampersands, use regex to replace matches with &: credit: blhsin, demo
&(?!(?:#\d+|#x[0-9a-f]+|\w+);)
Note that the above regular expressions won't take comments or CDATA
sections into account.
A standard XML parser will NEVER accept invalid XML, by design.
Your only option is to pre-process the input to remove the "predictably invalid" content, or wrap it in CDATA, prior to parsing it.
The accepted answer is good advice, and contains very useful links.
I'd like to add that this, and many other cases of not-wellformed and/or DTD-invalid XML can be repaired using SGML, the ISO-standardized superset of HTML and XML. In your case, what works is to declare the bogus THIS-IS-PART-OF-DESCRIPTION element as SGML empty element and then use eg. the osx program (part of the OpenSP/OpenJade SGML package) to convert it to XML. For example, if you supply the following to osx
<!DOCTYPE xml [
<!ELEMENT xml - - ANY>
<!ELEMENT description - - ANY>
<!ELEMENT THIS-IS-PART-OF-DESCRIPTION - - EMPTY>
]>
<xml>
<description>blah blah
<THIS-IS-PART-OF-DESCRIPTION>
</description>
</xml>
it will output well-formed XML for further processing with the XML tools of your choice.
Note, however, that your example snippet has another problem in that element names starting with the letters xml or XML or Xml etc. are reserved in XML, and won't be accepted by conforming XML parsers.
IMO these cases should be solved by using JSoup.
Below is a not-really answer for this specific case, but found this on the web (thanks to inuyasha82 on Coderwall). This code bit did inspire me for another similar problem while dealing with malformed XMLs, so I share it here.
Please do not edit what is below, as it is as it on the original website.
The XML format, requires to be valid a unique root element declared in the document.
So for example a valid xml is:
<root>
<element>...</element>
<element>...</element>
</root>
But if you have a document like:
<element>...</element>
<element>...</element>
<element>...</element>
<element>...</element>
This will be considered a malformed XML, so many xml parsers just throw an Exception complaining about no root element. Etc.
In this example there is a solution on how to solve that problem and succesfully parse the malformed xml above.
Basically what we will do is to add programmatically a root element.
So first of all you have to open the resource that contains your "malformed" xml (i. e. a file):
File file = new File(pathtofile);
Then open a FileInputStream:
FileInputStream fis = new FileInputStream(file);
If we try to parse this stream with any XML library at that point we will raise the malformed document Exception.
Now we create a list of InputStream objects with three lements:
A ByteIputStream element that contains the string: <root>
Our FileInputStream
A ByteInputStream with the string: </root>
So the code is:
List<InputStream> streams =
Arrays.asList(
new ByteArrayInputStream("<root>".getBytes()),
fis,
new ByteArrayInputStream("</root>".getBytes()));
Now using a SequenceInputStream, we create a container for the List created above:
InputStream cntr =
new SequenceInputStream(Collections.enumeration(str));
Now we can use any XML Parser library, on the cntr, and it will be parsed without any problem. (Checked with Stax library);
We have a windows forms legacy asp.net site that uses the AjaxFileUpload control to manage file uploads. One of our issues is that we have different file type uploads but these types are distinguished not by the extension, but by an element right before the extnsion, EG: .gh.zip vs. .gy.zip. It seems that if I add one of these, but not the other, to the AllowedFileTypes, it doesn't allow either. Is it possible to piggyback some additional JS validation code to prevent an invalid file name, or would I need to replace the entire module with something else, and if so, what would be the recommendation for something that's going to be the least time-consuming that will offer a reasonable amount of configuratability?
That control is open source - you can download the source and change it if you wish.
However, why would not just specifying zip as allowed file type work?
If I set a allowed extension of zip?
Then all of these work:
.gh.zip ok
.gy.zip ok
.pdf no
However, my markup is this:
<ajaxToolkit:AjaxFileUpload ID="AjaxFileUpload1" runat="server"
OnClientUploadCompleteAll="MyCompleteAll" ChunkSize="16384"
AllowedFileTypes="zip"
/>
So, above only allows zip files.
if I try to say add a pdf file to above que, then I get this:
So just add allowed extension type = zip
(Edit: do NOT include the "." in this extension)
I not sure why that would not work?
But as noted, you can grab the source - it is open source code now.
However, I suspect perhaps some other issue is going on here?
Or maybe you need "more" complex file extensions parsing?
I mean, you could for the "rare" cases or say some "out liner" cases allow that file up-load, and THEN the post-processing code could reject the file type anyway, right?
However, looking at above, just specify file type = zip, and you should be ok.
Currently, I'm working on a feature that involves parsing XML that we receive from another product. I decided to run some tests against some actual customer data, and it looks like the other product is allowing input from users that should be considered invalid. Anyways, I still have to try and figure out a way to parse it. We're using javax.xml.parsers.DocumentBuilder and I'm getting an error on input that looks like the following.
<xml>
...
<description>Example:Description:<THIS-IS-PART-OF-DESCRIPTION></description>
...
</xml>
As you can tell, the description has what appears to be an invalid tag inside of it (<THIS-IS-PART-OF-DESCRIPTION>). Now, this description tag is known to be a leaf tag and shouldn't have any nested tags inside of it. Regardless, this is still an issue and yields an exception on DocumentBuilder.parse(...)
I know this is invalid XML, but it's predictably invalid. Any ideas on a way to parse such input?
That "XML" is worse than invalid – it's not well-formed; see Well Formed vs Valid XML.
An informal assessment of the predictability of the transgressions does not help. That textual data is not XML. No conformant XML tools or libraries can help you process it.
Options, most desirable first:
Have the provider fix the problem on their end. Demand well-formed XML. (Technically the phrase well-formed XML is redundant but may be useful for emphasis.)
Use a tolerant markup parser to cleanup the problem ahead of parsing as XML:
Standalone: xmlstarlet has robust recovering and repair capabilities credit: RomanPerekhrest
xmlstarlet fo -o -R -H -D bad.xml 2>/dev/null
Standalone and C/C++: HTML Tidy works with XML too. Taggle is a port of TagSoup to C++.
Python: Beautiful Soup is Python-based. See notes in the Differences between parsers section. See also answers to this question for more
suggestions for dealing with not-well-formed markup in Python,
including especially lxml's recover=True option.
See also this answer for how to use codecs.EncodedFile() to cleanup illegal characters.
Java: TagSoup and JSoup focus on HTML. FilterInputStream can be used for preprocessing cleanup.
.NET:
XmlReaderSettings.CheckCharacters can
be disabled to get past illegal XML character problems.
#jdweng notes that XmlReaderSettings.ConformanceLevel can be set to
ConformanceLevel.Fragment so that XmlReader can read XML Well-Formed Parsed Entities lacking a root element.
#jdweng also reports that XmlReader.ReadToFollowing() can sometimes
be used to work-around XML syntactical issues, but note
rule-breaking warning in #3 below.
Microsoft.Language.Xml.XMLParser is said to be “error-tolerant”.
Go: Set Decoder.Strict to false as shown in this example by #chuckx.
PHP: See DOMDocument::$recover and libxml_use_internal_errors(true). See nice example here.
Ruby: Nokogiri supports “Gentle Well-Formedness”.
R: See htmlTreeParse() for fault-tolerant markup parsing in R.
Perl: See XML::Liberal, a "super liberal XML parser that parses broken XML."
Process the data as text manually using a text editor or
programmatically using character/string functions. Doing this
programmatically can range from tricky to impossible as
what appears to be
predictable often is not -- rule breaking is rarely bound by rules.
For invalid character errors, use regex to remove/replace invalid characters:
PHP: preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $s);
Ruby: string.tr("^\u{0009}\u{000a}\u{000d}\u{0020}-\u{D7FF}\u{E000}-\u{FFFD}", ' ')
JavaScript: inputStr.replace(/[^\x09\x0A\x0D\x20-\xFF\x85\xA0-\uD7FF\uE000-\uFDCF\uFDE0-\uFFFD]/gm, '')
For ampersands, use regex to replace matches with &: credit: blhsin, demo
&(?!(?:#\d+|#x[0-9a-f]+|\w+);)
Note that the above regular expressions won't take comments or CDATA
sections into account.
A standard XML parser will NEVER accept invalid XML, by design.
Your only option is to pre-process the input to remove the "predictably invalid" content, or wrap it in CDATA, prior to parsing it.
The accepted answer is good advice, and contains very useful links.
I'd like to add that this, and many other cases of not-wellformed and/or DTD-invalid XML can be repaired using SGML, the ISO-standardized superset of HTML and XML. In your case, what works is to declare the bogus THIS-IS-PART-OF-DESCRIPTION element as SGML empty element and then use eg. the osx program (part of the OpenSP/OpenJade SGML package) to convert it to XML. For example, if you supply the following to osx
<!DOCTYPE xml [
<!ELEMENT xml - - ANY>
<!ELEMENT description - - ANY>
<!ELEMENT THIS-IS-PART-OF-DESCRIPTION - - EMPTY>
]>
<xml>
<description>blah blah
<THIS-IS-PART-OF-DESCRIPTION>
</description>
</xml>
it will output well-formed XML for further processing with the XML tools of your choice.
Note, however, that your example snippet has another problem in that element names starting with the letters xml or XML or Xml etc. are reserved in XML, and won't be accepted by conforming XML parsers.
IMO these cases should be solved by using JSoup.
Below is a not-really answer for this specific case, but found this on the web (thanks to inuyasha82 on Coderwall). This code bit did inspire me for another similar problem while dealing with malformed XMLs, so I share it here.
Please do not edit what is below, as it is as it on the original website.
The XML format, requires to be valid a unique root element declared in the document.
So for example a valid xml is:
<root>
<element>...</element>
<element>...</element>
</root>
But if you have a document like:
<element>...</element>
<element>...</element>
<element>...</element>
<element>...</element>
This will be considered a malformed XML, so many xml parsers just throw an Exception complaining about no root element. Etc.
In this example there is a solution on how to solve that problem and succesfully parse the malformed xml above.
Basically what we will do is to add programmatically a root element.
So first of all you have to open the resource that contains your "malformed" xml (i. e. a file):
File file = new File(pathtofile);
Then open a FileInputStream:
FileInputStream fis = new FileInputStream(file);
If we try to parse this stream with any XML library at that point we will raise the malformed document Exception.
Now we create a list of InputStream objects with three lements:
A ByteIputStream element that contains the string: <root>
Our FileInputStream
A ByteInputStream with the string: </root>
So the code is:
List<InputStream> streams =
Arrays.asList(
new ByteArrayInputStream("<root>".getBytes()),
fis,
new ByteArrayInputStream("</root>".getBytes()));
Now using a SequenceInputStream, we create a container for the List created above:
InputStream cntr =
new SequenceInputStream(Collections.enumeration(str));
Now we can use any XML Parser library, on the cntr, and it will be parsed without any problem. (Checked with Stax library);
I've been given an existing application to work on which uses ColdFusion as a service. I'm a beginner with ColdFusion, however most of it I've been able to figure out. Now, I've been given a task to edit some existing reports and for some reason I'm having trouble determining where the headers for the columns are located. The .cfc file that the report URL calls looks like this:
<cffunction
name="report"
access="remote"
hint="Generate a Report"
output="true"
roles="view"
>
<cfset report = this.reportGenerate(argumentCollection=arguments)/>
<cfcontent variable="#report#" type="application/pdf" reset="Yes">
</cffunction>
<cffunction
name="reportGenerate"
access="package"
hint="Generate a Report"
output="false"
returntype="binary"
etc.
I have .cfr files in the application but I don't have Report Builder and have no way to edit them. I see where the data is being generated but I can't determine where the column headers are defined (at least in this case). The report URL just calls the .cfc file (a portion of which is shown above). Can anyone clue me in as to where the column headers might be defined?
Thanks so much,
Pete
You should be calling the .CFR reports using the CFREPORT coldfusion tag.
And you need to install ColdFusion Report Builder tool to have a look of the Columns defined.
These Columns are dragged and dropped in the ColdFusion Report Builder tool.
And saved there as .CFR file.
This .CFR file needs to be invoked using the CFREPORT tag.
Query objects from CFC can be passed to the CF report Builder from where these values can be pasted at the tool.
Also parameters can be passed to the .CFR file.
I am creating a web application using Thymeleaf with Spring. For that, I am following following document:
http://www.thymeleaf.org/doc/articles/thvsjsp.html
In this document, following code is being discussed:
https://github.com/thymeleaf/thymeleafexamples-thvsjsp
I am trying to get internationalization with Hindi (Indian language), for that I forked the repo given above and added one property file related to hi_IN (hindi-India), please see below code base on github:
https://github.com/prashantbhardwaj/thymeleafexamples-thvsjsp/tree/locale_change
When I tried, instead of getting hindi in browser, I got questions marks. I tried changing encoding of this property file from ISO-8859-1 to UTF-8, but no benefit. Values, I am expecting for given properties are as follows:
subscription.email=ईमेल
subscription.type=प्रकार
subscription.submit=प्रेषित करें!
subscriptionType.ALL_EMAILS=सभी ईमेल
subscriptionType.DAILY_DIGEST=प्रतिदिन का पत्र
In Messages_hi_IN.properties file, I changed all the values from hindi (which was not hindi but escaped version of hindi) to english so you can copy and paste values which are provided by me.
Could you please help get me desired results? Locale for Hindi is hi_IN.