Returning search results using xquery in marklogic - xquery

I'm trying to search for a term using XQuery in MarkLogic. When I run the following code I get nothing. Does anyone know what the problem is? I don't use a namespace, but I don't think that is the problem.
Note: when I run this query in Query Console in MarkLogic I get this message:
your query returned an empty sequence
Code:
{
for $article-title in /article/front/article-meta/title-group/article-title[cts:contains(., xdmp:get-request-field("term"))]
let $article-title := fn:tokenize(fn:string($article-title/article/front/article-meta/title-group/article-title), " ")[1 to 100]
let $journal-title := $article-title/article/front/journal-meta/journal-title-group/journal-title/text()
let $contrib := $article-title/article/front/article-meta/contrib-group/contrib/text()
let $year:= $article-title/article/front/article-meta/pub-date/text()
let $sec-title:= $article-title/article/body/section/sec-title/text()
return (
<tr>
<td colspan="10"><hr/></td>
</tr>,
<tr>
<td><b>{$article-title}</b></td>
<td><b>{$journal-title}</b></td>
<td>{$contrib}</td>
<td>{$year}</td>
<td>{$sec-title}</td>
</tr>,
<tr>
<td colspan="10" class="article-title">{$article-title} ...</td>
</tr>
)
}
XML sample:
<?xml version="1.0" encoding="UTF-8"?>
<article article-type="article" xml:lang="en" structure-type="article" dtd- version="1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-title-group>
<journal-title xml:lang="en">Psychology of Addictive Behaviors</journal-title>
</journal-title-group>
<issn pub-type="print">0893-164X</issn>
<issn pub-type="online">1939-1501</issn>
<publisher>
<publisher-name>American Psychological Association</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="apaID">adb_21_4_462</article-id>
<article-id pub-id-type="doi">10.1037/0893-164X.21.4.462</article-id>
<article-id pub-id-type="pi-uid">2007-18113-004</article-id>
<article-categories>
<subj-group subj-group-type="toc-heading">
<subject>Articles</subject>
</subj-group>
</article-categories>
<title-group>
<article-title>Social Dominance Mediates the Association of Testosterone and Neurobehavioral Disinhibition With Risk for Substance Use Disorder</article-title>
</title-group>
<contrib-group content-type="journal-editors">
<contrib contrib-type="editor" corresp="no" xlink:type="simple">
<string-name>
<given-names>Howard J.</given-names> <surname>Shaffer</surname>
</string-name>
<role>Editor</role>
</contrib>
</contrib-group>
<contrib-group content-type="primary-authors">
<contrib contrib-type="author" corresp="yes" rid="aff1 corr1" xlink:type="simple">
<string-name>
<given-names>Ralph E.</given-names> <surname>Tarter</surname>
</string-name>
</contrib>
<contrib contrib-type="author" corresp="no" rid="aff1" xlink:type="simple">
<string-name>
<given-names>Levent</given-names> <surname>Kirisci</surname>
</string-name>
</contrib>
<contrib contrib-type="author" corresp="no" rid="aff1" xlink:type="simple">
<string-name>
<given-names>Galina P.</given-names> <surname>Kirillova</surname>
</string-name>
</contrib>
<contrib contrib-type="author" corresp="no" rid="aff1" xlink:type="simple">
<string-name>
<given-names>Judy</given-names> <surname>Gavaler</surname>
</string-name>
</contrib>
<contrib contrib-type="author" corresp="no" rid="aff2" xlink:type="simple">
<string-name>
<given-names>Peter</given-names> <surname>Giancola</surname>
</string-name>
</contrib>
</contrib-group>
</article-meta>
</front>
</article>

I think the meta-question here is: how do you debug a complex query when running the query returns nothing (an empty sequence)? It's not very useful for us to debug the query for you; it's much more useful for you to know how to debug it yourself.
If you've got a schema for the source, then running it as a schema-aware query can be very useful, even if you only do this temporarily for debugging purposes. A schema-aware query processor will check your path expressions against the schema, and tell you if you are trying to select elements or paths that, according to the schema, can never exist.
After that it's a process of logical deduction, and/or experimentation to distill the query to its essence. Because you've only got one "for" clause, and the return clause always produces something, the only way of getting an empty sequence as the result is if the for clause selects nothing. So that reduces it to a problem with the expression
/article/front/article-meta/title-group/article-title
[cts:contains(., xdmp:get-request-field("term"))]
At this stage using an IDE like oXygen can really help: put your source document into the editor, open the XPath evaluator, and enter this path. You'll need to modify it, because it uses MarkLogic extension functions. But you can start by eliminating the predicate and seeing if the path selects anything. I attempted that, but unfortunately your XML isn't well-formed so I gave up. But it's not difficult to do yourself. If the path expression selects nothing, remove trailing steps from the path until you get a result: the last step that you removed is the one that's wrong.
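The "remove trailing steps" check can also be run directly in Query Console. This is a minimal sketch, assuming the documents are loaded in the database the console points at; the count wrapper element and its path attribute are only illustrative names:

```xquery
xquery version "1.0-ml";

(: Count the nodes selected by each successively longer prefix of the path. :)
(
  <count path="/article">{ fn:count(/article) }</count>,
  <count path="/article/front">{ fn:count(/article/front) }</count>,
  <count path="/article/front/article-meta">{ fn:count(/article/front/article-meta) }</count>,
  <count path="/article/front/article-meta/title-group">{ fn:count(/article/front/article-meta/title-group) }</count>,
  <count path="/article/front/article-meta/title-group/article-title">{ fn:count(/article/front/article-meta/title-group/article-title) }</count>
)
```

The first prefix whose count is 0 contains the step to look at. If every count is non-zero, the predicate is the remaining suspect.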

First, there are some errors in your XPath. You were selecting an article-title element but treating it as an article element in the following XPath. Next, you reassigned the $article-title variable (which is not actually possible in most XQuery processors - MarkLogic is an exception) to a string, and then executed XPath on that as if it were a node. Then for the remaining variable assignments, you were both operating on a string as if it were a node AND treating the variable as an article, when it would have been an article-title.
I updated the query by changing the for variable assignment to article and moving the rest of the XPath into a predicate. Then the other variables were updated to query from the $article variable instead of $article-title, which is a string.
for $article in /article[front/article-meta/title-group/article-title/cts:contains(., xdmp:get-request-field("term"))]
let $article-title := fn:tokenize(fn:string($article/front/article-meta/title-group/article-title), " ")[1 to 100]
let $journal-title := $article/front/journal-meta/journal-title-group/journal-title/text()
let $contrib := $article/front/article-meta/contrib-group/contrib/text()
let $year:= $article/front/article-meta/pub-date/text()
let $sec-title:= $article/body/section/sec-title/text()
There are a couple other possibilities I would check, if you continue to have problems: 1) Be sure your call to xdmp:get-request-field() is actually returning the value you expect; 2) Database index settings affect the behavior of cts:contains, so if any of the elements in the path you are selecting are excluded from the index, then cts:contains will treat it as if it doesn't exist.
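To rule out the first possibility, the request field can be read once with an explicit default and checked before searching. This is only a sketch: the two output columns are a reduced, illustrative version of the original table, and when the code is run in Query Console there is most likely no "term" field at all, so the default is what comes back:

```xquery
xquery version "1.0-ml";

(: The second argument is the default returned when the field is absent. :)
let $term := xdmp:get-request-field("term", "")
return
  if ($term eq "") then
    <p>No "term" request field was supplied.</p>
  else
    for $article in /article[cts:contains(front/article-meta/title-group/article-title,
                                          cts:word-query($term))]
    return
      <tr>
        <td>{ fn:data($article/front/article-meta/title-group/article-title) }</td>
        <td>{ fn:data($article/front/journal-meta/journal-title-group/journal-title) }</td>
      </tr>
```

For a quick test in Query Console, replacing the first let with a literal such as "Dominance" separates search problems from request-handling problems.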

Related

XSLT-style mini transformation in Xquery?

At the moment in Xquery 3.1 (in eXist 4.7) I receive XML fragments that look like the following (from eXist's Lucene full text search):
let $text :=
<tei:text>
<tei:front>
<tei:div>
<tei:listBibl>
<tei:bibl>There is some</tei:bibl>
<tei:bibl>text in certain elements</tei:bibl>
</tei:listBibl>
</tei:div>
<tei:div>
<tei:listBibl>
<tei:bibl>which are subject <exist:match>to</exist:match> a Lucene search</tei:bibl>
<tei:bibl></tei:bibl>
</tei:listBibl>
</tei:div>
</tei:front>
<tei:body>
<tei:p>and often produces</tei:p>
<tei:p>a hit.</tei:p>
</tei:body>
</tei:text>
Currently I have Xquery send this fragment to an XSLT stylesheet in order to transform it into HTML like this:
<td>...elements which are subject <span class="search-hit">to</span> a Lucene search and often p...
Where the stylesheet's job is to return 30 characters of text before and after <exist:match/> and put the content of <exist:match/> into a span. There is only one <exist:match/> per transformation.
This all works fine. However, it's occurred to me that it is a very small job with effectively a single transformation of only one element, the rest being a sort of string-join. I therefore wonder if this can't be done efficiently in Xquery.
In trying to do this, I can't seem to find a way to handle the string content up to the <exist:match/> and then the string content after <exist:match/>. My idea is, in pseudo code, to output a result like:
let $textbefore := some function to get the text before <exist:match/>
let $textafter := some function to get the text after <exist:match/>
return <td>...{$textbefore}
<span class="search-hit">
{$text//exist:match/text()}
</span> {$textafter}...</td>
Is this even worth doing in Xquery vs the current Xquery -> XSLT pipeline I have?
Many thanks.
I think it can be done as
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare namespace tei = "http://example.com/tei";
declare namespace exist = "http://example.com/exist";
declare option output:method 'html';
let $text :=
<tei:text>
<tei:front>
<tei:div>
<tei:listBibl>
<tei:bibl>There is some</tei:bibl>
<tei:bibl>text in certain elements</tei:bibl>
</tei:listBibl>
</tei:div>
<tei:div>
<tei:listBibl>
<tei:bibl>which are subject <exist:match>to</exist:match> a Lucene search</tei:bibl>
<tei:bibl></tei:bibl>
</tei:listBibl>
</tei:div>
</tei:front>
<tei:body>
<tei:p>and often produces</tei:p>
<tei:p>a hit.</tei:p>
</tei:body>
</tei:text>
,
$match := $text//exist:match,
$text-before-all := normalize-space(string-join($match/preceding::text(), ' ')),
(: keep the last 30 characters; starting at length - 30 would keep 31 :)
$text-before := substring($text-before-all, string-length($text-before-all) - 29),
$text-after := substring(normalize-space(string-join($match/following::text(), ' ')), 1, 30)
return
<td>...{$text-before}
<span class="search-hit">
{$match/text()}
</span> {$text-after}...</td>
which is not really much of a query in XQuery either but just some XPath selection plus some possibly expensive string joining and extraction on the preceding and following axis.
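If the snippet is needed in more than one place, the same selection can be wrapped in a function. The sketch below is one possible shape, not the stylesheet's exact behaviour: local:snippet, its $width parameter, and the small test fragment with an m element are all hypothetical. The padding spaces are built with concat because whitespace-only text between an enclosed expression and an element constructor is stripped by default.

```xquery
xquery version "3.0";

(: Hypothetical helper: $match is the hit element, $width the number of
   context characters to keep on each side. :)
declare function local:snippet($match as element(), $width as xs:integer) as element(td) {
  let $before := normalize-space(string-join($match/preceding::text(), ' '))
  let $after := normalize-space(string-join($match/following::text(), ' '))
  (: first character to keep so that at most $width characters remain :)
  let $start := max((1, string-length($before) - $width + 1))
  return
    <td>{
      concat('...', substring($before, $start), ' ')
    }<span class="search-hit">{ string($match) }</span>{
      concat(' ', substring($after, 1, $width), '...')
    }</td>
};

(: Illustrative input only; real input would be the eXist search fragment. :)
let $text := <p>which are subject <m>to</m> a Lucene search</p>
return local:snippet($text/m, 30)
```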

How to add a value to the existing element value and return it as a new value

This is the xml file.
<?xml version="1.0" encoding="UTF-8"?>
<root>
<AtcoCode> System-Start-Date= 2018-05-16T12:35:48.6929328-04:00, " ", System-End-Date = 9999-12-31, " ", 150042010003</AtcoCode>
<NaptanCode>esxatgjd</NaptanCode>
<PlateCode>
</PlateCode>
<CleardownCode>
</CleardownCode>
<CommonName>Upper Park</CommonName>
<CommonNameLang>
</CommonNameLang>
<ShortCommonName>
</ShortCommonName>
<ShortCommonNameLang>
</ShortCommonNameLang>
<Landmark>Upper Park</Landmark>
<LandmarkLang>
</LandmarkLang>
<Street>High Road</Street>
<StreetLang>
</StreetLang>
<Crossing>
</Crossing>
<CrossingLang>
</CrossingLang>
<Indicator>adj</Indicator>
<IndicatorLang>
</IndicatorLang>
<Bearing>NE</Bearing>
<NptgLocalityCode>E0046286</NptgLocalityCode>
<LocalityName>Loughton</LocalityName>
<ParentLocalityName>
</ParentLocalityName>
<GrandParentLocalityName>
</GrandParentLocalityName>
<Town>Loughton</Town>
<TownLang>
</TownLang>
<Suburb>
</Suburb>
<SuburbLang>
</SuburbLang>
<LocalityCentre>1</LocalityCentre>
<GridType>U</GridType>
<Easting>541906</Easting>
<Northing>195737</Northing>
<Co-ordinates>51.64255,0.04944</Co-ordinates>
<StopType>BCT</StopType>
<BusStopType>MKD</BusStopType>
<TimingStatus>OTH</TimingStatus>
<DefaultWaitTime>
</DefaultWaitTime>
<Notes>
</Notes>
<NotesLang>
</NotesLang>
<AdministrativeAreaCode>080</AdministrativeAreaCode>
<CreationDateTime>2006-11-06T00:00:00</CreationDateTime>
<ModificationDateTime>2010-01-16T07:58:02</ModificationDateTime>
<RevisionNumber>5</RevisionNumber>
<Modification>rev</Modification>
<Status>act</Status>
</root>
How to achieve this?
Question: Create a path range index for the Status element and fetch all the documents that have status del.
After fetching the documents, create a new element called currentreservationnumber under the RevisionNumber element.
The value of currentrevisionnumber should be the RevisionNumber plus 1.
I think the warning about sequential numbers is related to system-wide unique numbers/ids (like Oracle sequence), so not a worry in this case?
If you only ever have one RevisionNumber, and you can find it without a path index, you can maybe get by with element-value query on the RevisionNumber since it's already indexed.
Given that you get the document somehow, it could be as simple as:
let $doc := fn:doc ('/foo.xml')
let $rev-node := $doc/root/RevisionNumber
return xdmp:node-insert-after ($rev-node, <currentreservationnumber>{$rev-node + 1}</currentreservationnumber>)
though remember to consider locking if you are doing a big query/update. And you might need to switch to node-replace if there is already a currentreservationnumber.
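Putting the pieces together, a rough sketch of the whole update might look like the following. It assumes the element-value query on Status is acceptable in place of the path range index, that each document has exactly one root/RevisionNumber, and that at most one currentreservationnumber already exists; none of that is confirmed by the question.

```xquery
xquery version "1.0-ml";

(: Find documents whose Status element has the value "del". :)
for $doc in cts:search(fn:doc(), cts:element-value-query(xs:QName("Status"), "del"))
let $rev-node := $doc/root/RevisionNumber
let $new :=
  <currentreservationnumber>{ xs:integer($rev-node) + 1 }</currentreservationnumber>
let $existing := $doc/root/currentreservationnumber
return
  (: Replace the element if it is already there, otherwise add it as a sibling. :)
  if (fn:exists($existing))
  then xdmp:node-replace($existing, $new)
  else xdmp:node-insert-after($rev-node, $new)
```

Note that the sample document has Status act, so it would not be selected by this query.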

How can I find the duration of a song when given its ID using XQuery?

First off, yes this is homework - please suggest where I am going wrong, but please do not do my homework for me.
I am learning XQuery, and one of my tasks is to take a list of song IDs for a performance and determine the total duration of the performance. Given the snippets below, can anyone point me to where I can determine how to cross reference the songID from the performance to the duration of the song?
I've listed my attempts at the end of the question.
my current XQuery code looks like:
let $songIDs := doc("C:/Users/rob/Downloads/A4_FLOWR.xml")
//SongSet/Song
for $performance in doc("C:/Users/rob/Downloads/A4_FLOWR.xml")
//ContestantSet/Contestant/Performance
return if($performance/SongRef[. =$songIDs/@SongID])
then <performanceDuration>{
data($performance/SongRef)
}</performanceDuration>
else ()
Which outputs:
<performanceDuration>S005 S003 S004</performanceDuration>
<performanceDuration>S001 S007 S002</performanceDuration>
<performanceDuration>S008 S009 S006</performanceDuration>
<performanceDuration>S002 S004 S007</performanceDuration>
Each S00x is the ID of a song, which is found in the referenced XML document (partial document):
<SongSet>
<Song SongID="S001">
<Title>Bah Bah Black Sheep</Title>
<Composer>Mother Goose</Composer>
<Duration>2.99</Duration>
</Song>
<Song SongID="S005">
<Title>Thank You Baby</Title>
<Composer>Shania Twain</Composer>
<Duration>3.02</Duration>
</Song>
</SongSet>
The performance section looks like:
<Contestant Name="Fletcher Gee" Hometown="Toronto">
<Repertoire>
<SongRef>S001</SongRef>
<SongRef>S002</SongRef>
<SongRef>S007</SongRef>
<SongRef>S010</SongRef>
</Repertoire>
<Performance>
<SongRef>S001</SongRef>
<SongRef>S007</SongRef>
<SongRef>S002</SongRef>
</Performance>
</Contestant>
My Attempts
I thought I would use nested loops, but that fails:
let $songs := doc("C:/Users/rob/Downloads/A4_FLOWR.xml")
//SongSet/Song
for $performance in doc("C:/Users/rob/Downloads/A4_FLOWR.xml")
//ContestantSet/Contestant/Performance
return if($performance/SongRef[. =$songs/@SongID])
for $song in $songIDs
(: gives an error in BaseX about incomplete if :)
then <performanceDuration>{
data($performance/SongRef)
}</performanceDuration>
else ()
--Edit--
I've fixed the inner loop, however I am getting all the songs durations, not just the ones that match id's. I have a feeling that this is due to variable scope, but I'm not sure:
let $songs := doc("C:/Users/rob/Downloads/A4_FLOWR.xml")//SongSet/Song
for $performance in doc("C:/Users/rob/Downloads/A4_FLOWR.xml")//ContestantSet/Contestant/Performance
return if($performance/SongRef[. =$songs/@SongID])
then <performanceDuration>{
for $song in $songs
return if($performance/SongRef[. =$songs/@SongID])
then
sum($song/Duration)
else ()
}</performanceDuration>
else ()
}
Output:
<performanceDuration>2.99 1.15 3.15 2.2 3.02 2.25 3.45 1.29 2.33 3.1</performanceDuration>
Your immediate problem is syntactic: you've inserted your inner loop between the condition and the keyword 'then' in a conditional. Fix that first:
return if ($performance/SongRef = $songs/@SongID) then
<performanceDuration>{
(: put your inner loop HERE :)
}</performanceDuration>
else ()
Now think yourself into the situation of the query evaluator inside the performanceDuration element. You have the variable $performance, you can find all the song references using $performance/SongRef, and for each song reference in the performance element, you can find the corresponding song element by matching the SongRef value with $songs/@SongID.
My next step at this point would be to ask myself:
For a given song reference, how do I find the song element for that song, and then the duration for that song?
Is there a way to get the sum of some set of durations? Is there, for example, a sum() function? (I'm pretty sure there is, but at this point I always pull up the Functions and Operators spec and look it up to be sure of the signature.)
What type does the duration info have? I'd expect it to be minutes and seconds, and I'd be worrying about duration arithmetic, but your sample makes it look like decimals, which will be easy.
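As an illustration of the first two questions only, here is a small sketch on inline data rather than the assignment file. The two songs and the single performance are made up for the example, and it assumes every SongRef has a matching Song:

```xquery
let $songs := (
  <Song SongID="S001"><Duration>2.99</Duration></Song>,
  <Song SongID="S005"><Duration>3.02</Duration></Song>
)
let $performance :=
  <Performance>
    <SongRef>S001</SongRef>
    <SongRef>S005</SongRef>
  </Performance>
return
  <performanceDuration>{
    sum(
      (: look up each referenced song by its ID and take its duration :)
      for $ref in $performance/SongRef
      return xs:decimal($songs[@SongID = $ref]/Duration)
    )
  }</performanceDuration>
```

The key difference from the attempt in the question is that the lookup compares one SongRef at a time against @SongID, instead of testing the whole performance against the whole song list.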

MarkLogic Join Query

Hi, I am new to MarkLogic and to the XQuery world. I can't think of a starting point for writing the following logic in MarkLogic XQuery. I would be thankful if somebody could give me an idea or a sample so I can achieve the following:
I want to Query A.XML based on a word lookup in B.XML. Query should produce C.XML. The logic should be as follows:
A.XML
<root>
<content> The state passed its first ban on using a handheld cellphone while driving in 2004 Nokia Vodafone Nokia Growth Recession Creicket HBO</content>
</root>
B.XML
<WordLookUp>
<companies>
<company name="Vodafone">Vodafone</company>
<company name="Nokia">Nokia</company>
</companies>
<topics>
<topic group="Sports">Cricket</topic>
<topic group="Entertainment">HBO</topic>
<topic group="Finance">GDP</topic>
</topics>
<moods>
<mood number="4">Growth</mood>
<mood number="-5">Depression</mood>
<mood number="-3">Recession</mood>
</moods>
</WordLookUp>
C.XML (Result XML)
<root>
<content> The state passed its first ban on using a handheld cellphone while driving in 2004 Nokia Vodafone Nokia Growth Recession Creicket HBO</content>
<updatedElement>
<companies>
<company count="1">Vodafone</company>
<company count="2">Nokia</company>
</companies>
<mood>1</mood>
<topics>
<topic count="1">Sports</topic>
<topic count="1">Entertainment</topic>
</topics>
<word-count>22</word-count>
</updatedElement>
</root>
Search each company/text() of A.xml in B.xml, if match found create tag:
TAG {company count="Number of occurrence of that word"}company/@name
{/company}
Search each topic/text() of A.xml in B.xml, if match found create tag
TAG {topic topic="Number of occurrences of that word"}topic/@group{/topic}
Search each mood/text() of A.xml in B.xml, if match found
[occurrences of first word * {/mood[first word]/@number}] + [occurrences of second word * {/mood[second word]/@number})]....
get the word count of element.
This was a fun one, and I learned a few things in the process. Thanks!
Note: to get the results you wanted, I fixed a typo in A.xml ("Creicket" -> "Cricket").
The following solution uses two MarkLogic-specific functions:
cts:highlight (for replacing matching text with nodes which you can then count)
cts:tokenize (for breaking up a given string into word, space, and punctuation parts)
It also includes some powerful magic specific to those two functions, respectively:
the dynamic binding of the special variable $cts:text (which isn't really necessary for this particular use case, but I digress), and
the data model extension which adds these subtypes of xs:string:
cts:word,
cts:space, and
cts:punctuation.
Enjoy!
xquery version "1.0-ml";
(: Generic function using MarkLogic's ability to find query matches within a single node :)
declare function local:find-matches($content, $search-text) {
cts:highlight($content, $search-text, <MATCH>{$cts:text}</MATCH>)
//MATCH
};
(: Generic function using MarkLogic's ability to tokenize text into words, punctuation, and spaces :)
declare function local:get-words($text) {
cts:tokenize($text)[. instance of cts:word]
};
(: The rest of this is pure XQuery :)
let $content := doc("A.xml")/root/content,
$lookup := doc("B.xml")/WordLookUp
return
<root>
{$content}
<updatedElement>
<companies>{
for $company in $lookup/companies/company
let $results := local:find-matches($content, string($company))
where exists($results)
return
<company count="{count($results)}">{string($company/@name)}</company>
}</companies>
<mood>{
sum(
for $mood in $lookup/moods/mood
let $results := local:find-matches($content, string($mood))
return count($results) * $mood/@number
)
}</mood>
<topics>{
for $topic in $lookup/topics/topic
let $results := local:find-matches($content, string($topic))
where exists($results)
return
<topic count="{count($results)}">{string($topic/@group)}</topic>
}</topics>
<word-count>{
count(local:get-words($content))
}</word-count>
</updatedElement>
</root>
Let me know if you have any follow-up questions about how all the above works. At first, I was inclined to use cts:search or cts:contains, which are the bread and butter for search in MarkLogic. But I realized that this example wasn't so much about search (finding documents) as it was about looking up matching text within an already-given document. If you needed to extend this somehow to aggregate across a large number of documents, then you'd want to look into the additional use of cts:search or cts:contains.
One final caveat: if you think your content might have <MATCH> elements already, you'll want to use a different element name when calling cts:highlight (a name which you can guarantee won't conflict with your content's existing element names). Otherwise, you'll potentially get the wrong number of results (higher than the accurate count).
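One way to make such a collision unlikely is to put the marker element in its own namespace. This is a sketch under that assumption; the hl prefix and its URI are hypothetical and can be anything that does not occur in the content:

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI, chosen only so the marker cannot collide with content. :)
declare namespace hl = "urn:example:highlight";

declare function local:find-matches($content, $search-text) {
  cts:highlight($content, $search-text, <hl:MATCH>{$cts:text}</hl:MATCH>)
    //hl:MATCH
};

(: Count occurrences of a word in an in-memory element. :)
fn:count(local:find-matches(<content>Nokia Vodafone Nokia</content>, "Nokia"))
```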
ADDENDUM:
I was curious if this could be done without cts:highlight, given that cts:tokenize already breaks up the text into all the words for you. The same result is produced using this alternative implementation of local:find-matches (provided you swap the order of the function declarations because one depends on the other):
(: Find word matches by comparing them one-by-one :)
declare function local:find-matches($content, $search-text) {
local:get-words($content)[cts:stem(.) = cts:stem($search-text)]
};
It uses cts:stem to normalize the given word to its stem, so, for example searching for "pass" will match "passed", etc. However, this still won't work for multi-word (phrase) searches. So to be safe, I'd stick with using cts:highlight, which, like cts:search and cts:contains, can handle any cts:query you give it (including simple word/phrase searches like we do above).
Might make sense to step back and ask if you might be better served modeling your data and or documents for use with a document oriented database instead of an rdbms
This is simpler and shorter, and it is fully compliant XQuery with no implementation extensions, so it works with any compliant XQuery 1.0 processor:
let $content := doc('file:///c:/temp/delete/A.xml')/*/*,
$lookup := doc('file:///c:/temp/delete/B.xml')/*,
$words := tokenize($content, '\W+')[.]
return
<root>
{$content}
<updatedElement>
<companies>
{for $c in $lookup/companies/*,
$occurs in count(index-of($words, $c))
return
if($occurs)
then
<company count="{$occurs}">
{$c/text()}
</company>
else ()
}
</companies>
<mood>
{
sum($lookup/moods/*[false or index-of($words, data(.))]/@number)
}
</mood>
<topics>
{for $t in $lookup/topics/*,
$occurs in count(index-of($words, $t))
return
if($occurs)
then
<topic count="{$occurs}">
{data($t/@group)}
</topic>
else ()
}
</topics>
<word-count>{count($words)}</word-count>
</updatedElement>
</root>
When applied on the provided files A.xml and B.XML (contained in the local directory c:/temp/delete), the wanted, correct result is produced:
<root>
<content> The state passed its first ban on using a handheld cellphone while driving in 2004 Nokia Vodafone Nokia Growth Recession Cricket HBO</content>
<updatedElement>
<companies>
<company count="1">Vodafone</company>
<company count="2">Nokia</company>
</companies>
<mood>1</mood>
<topics>
<topic count="1">Sports</topic>
<topic count="1">Entertainment</topic>
</topics>
<word-count>22</word-count>
</updatedElement>
</root>
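Note that the mood expression in this version adds each matching mood's number once, which gives 1 for the sample because Growth and Recession each occur exactly once. The question asks for occurrences multiplied by the number, so a weighted variant may be closer to the intent. A sketch in plain XQuery, using made-up inline data in which Growth occurs twice:

```xquery
let $words := tokenize("Growth Recession Growth", '\W+')[.]
let $moods := (
  <mood number="4">Growth</mood>,
  <mood number="-3">Recession</mood>
)
return
  sum(
    (: occurrences of the word times the mood's number :)
    for $m in $moods
    return count(index-of($words, string($m))) * xs:integer($m/@number)
  )
```

For this input the weighted sum is 2 * 4 + 1 * -3 = 5.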

Peculiar error with ColdFusion on BlueDragon.NET

We've got an odd issue occurring with ColdFusion on BlueDragon.NET. Asking here because of the broad experience of StackOverflow users.
Tags inside content POSTed to our BlueDragon.NET server get removed, and we're not sure where in the stack they are being removed. So for example if we post this data
[CORE]
Lesson_Status=Incomplete
Lesson_Location=comm_13_a02_bs_enus_t17s06v01
score=
time=00:00:56
[Core_Lesson]
<sd ac="" pc="7.0" at="1289834380459" ct="" ><t id="lo8" sc=";;" st="c" /></sd>
<sd ac='' pc='7.0' at='1289834380459' ct='' ><t id='lo8' sc=';;' st='c' /></sd>
<sd ac="" pc="7.0" at="1289834380459" ct="" ><t id="lo8" sc=";;" st="c" /></sd>
<sd ac="" pc="7.0" at="1289834380459" ct="" ><t id="lo8" sc=";;" st="c" /></sd>
<b>hello1</b>
<i>hello2</i>
<table border><td>hello3</td></table>
<sd>hello4</sd>
<sd ac="1">hello5</sd>
<t>hello6</t>
<t />
<t attr="hello8" />
<strong>hello10</strong>
<img>
><>
What we get back is this:
[CORE]
Lesson_Status=Incomplete
Lesson_Location=comm_13_a02_bs_enus_t17s06v01
score=
time=00:00:56
[Core_Lesson]
hello1
hello2
hello3
hello4
hello5
hello6
hello10
>
That is, anything that starts with < and ends with > is getting stripped or filtered and no longer appears in ColdFusion's FORM scope when it's posted.
Our server with BlueDragon JX does not suffer this problem.
If we bypass using the default FORM scope and use this code, the tag-like content appears:
<cfscript>
// get the content string of the raw HTTP headers, will include all POST content as a long querystring
RAWREQUEST = GetHttpRequestData();
// split the string on "&" character, each variable should now be separate
// note that at this point duplicate variables will get clobbered
RAWFORMFIELDS = ListToArray(RAWREQUEST.content, "&");
// We're creating a structure like "FORM", but better
BetterFORM = StructNew();
// Go over each of the raw form fields, take the key
// and add it as a key, and decode the value into the value field
// and trap the whole thing if for some reason garbage gets in there
for(i=1;i LTE ArrayLen(RAWFORMFIELDS);i = i + 1) {
temp = ListToArray(RAWFORMFIELDS[i], "=");
try {
tempkey = temp[1];
tempval = URLDecode(temp[2]);
StructInsert(BetterFORM, tempkey, tempval);
} catch(Any e) {
tempThisError = "Malformed Data: " & RAWFORMFIELDS[i];
// Log the value of tempThisError here?
// WriteOutput(tempThisError);
}
}
</cfscript>
<cfdump var="#BetterFORM#">
If we do this, and use the created BetterFORM variable, it's there, so it does not seem to be a problem with the requests being filtered at some other point in the stack. I was thinking maybe it was URLScan, but that appears not to be installed. Since BD.NET runs on .NET as the engine, perhaps there's some sanitization setting that is being used on all variables somehow?
Suggestions, ideas, etc are welcome on this issue.
I don't have a BD.NET instance handy to check, but Adobe ColdFusion has a setting in the cf administrator to strip "invalid tags". That's my best guess. Adobe CF replaces them with "invalidTag", my guess is that BD.Net just strips it silently.
It turned out to be very mundane.
We had a custom tag that did customized string replacements. On one server, it was modified to NOT replace all tags. On this server, we were using an older version that did. So the fault was not a difference between BlueDragon JX and BlueDragon.NET -- it was developer team error.
