Removing consecutive numbers from a sequence in XQuery - functional-programming

XQuery
Input: (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
Output: (1,7,14,17,24,28)
I tried to remove consecutive numbers from the input sequence using XQuery functions, but failed to do so:
xquery version "1.0" encoding "utf-8";
declare namespace ns1="http://www.somenamespace.org/types";
declare variable $request as xs:integer* external;
declare function local:func($reqSequence as xs:integer*) as xs:integer* {
let $nonRepeatSeq := for $count in (1 to count($reqSequence)) return
if ($reqSequence[$count+1] - $reqSequence) then
remove($reqSequence,$count+1)
else ()
return
$nonRepeatSeq
};
local:func((1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28))
Please suggest how to do this in XQuery, in a functional style.

Two simple ways to do this in XQuery. Both rely on being able to assign the sequence of values to a variable, so that we can look at pairs of individual members of it when we need to.
First, just iterate over the values and select (a) the first value, (b) any value which is not one greater than its predecessor, and (c) any value which is not one less than its successor. [OP points out that the last value also needs to be included; left as an exercise for the reader. Or see Michael Kay's answer, which provides a terser formulation of the filter; DeMorgan's Law strikes again!]
let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
for $v at $pos in $vseq
return if ($pos eq 1
or $vseq[$pos - 1] ne $v - 1
or $vseq[$pos + 1] ne $v + 1)
then $v
else ()
Or, second, do roughly the same thing in a filter expression:
let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
return $vseq[
for $i in position() return
$i eq 1
or . ne $vseq[$i - 1] + 1
or . ne $vseq[$i + 1] - 1]
The primary difference between these two ways of performing the calculation and your non-working attempt is that they don't say anything about changing or modifying the sequence; they simply specify a new sequence. By using a filter expression, the second formulation makes explicit that the result will be a subsequence of $vseq; the for expression makes no such guarantee in general (although because for each value it returns either the empty sequence or the value itself, we can see that here too the result will be a subsequence: a copy of $vseq from which some values have been omitted).
Many programmers find it difficult to stop thinking in terms of assignment to variables or modification of data structures, but it's worth some effort.
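As a sketch of how the last value can be kept as well, the test can be inverted: instead of listing reasons to keep a value, drop a value only when it has both a predecessor that is one smaller and a successor that is one larger. This relies on the general comparison =, which is simply false when one operand is the empty sequence, so the first and last values need no special case.

```xquery
let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
for $v at $pos in $vseq
return
  (: $vseq[0] and $vseq[count($vseq) + 1] are empty, so "=" yields false there :)
  if ($vseq[$pos - 1] = $v - 1 and $vseq[$pos + 1] = $v + 1)
  then ()   (: strictly inside a run of consecutive numbers: drop it :)
  else $v   (: start or end of a run, or an isolated value: keep it :)
```

For the sample input this returns 1, 7, 14, 17, 24, 28.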
[Addendum] I may be overlooking something, but I don't see a way to express this calculation in pure XPath 2.0, since XPath 2.0 seems not to have any mechanism that can bind a variable like $vseq to a non-singleton sequence of values. (XPath 3.0 has let expressions, so it's not a challenge there. The second formulation above is itself pure XPath 3.0.)

In XSLT this can be done as:
<xsl:for-each-group select="$in" group-adjacent=". - position()">
<xsl:sequence select="current-group()[1], current-group()[last()]"/>
</xsl:for-each-group>
In XQuery 3.0 you can do it with tumbling windows, but I'm too lazy to work out the detail.
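A minimal sketch of the tumbling-window variant, assuming an XQuery 3.0 processor with window support and assuming the input is bound to $in as in the XSLT example. Each window is one run of consecutive numbers: it is closed when the next item is not exactly one greater than the current item, and the final window is closed automatically at the end of the sequence.

```xquery
xquery version "3.0";
let $in := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
for tumbling window $w in $in
  start when true()
  (: $e is the current item, $n the item after it (empty at the very end) :)
  end $e next $n when $n ne $e + 1
return
  (: a run of length 1 contributes its single value only once :)
  if (count($w) eq 1) then $w else ($w[1], $w[last()])
```

For the sample input this returns 1, 7, 14, 17, 24, 28.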
An XPath 2.0 solution (assuming the input sequence is in $in) is:
for $i in 1 to count($in)
return $in[$i][not(. eq $in[$i - 1]+1 and . eq $in[$i+1]-1)]

There are several logic and XQuery usage errors in your solution, but the main problem with it is that variables in XQuery are immutable, so you cannot reassign a value to one once assigned. Therefore, it's often easier to think about these types of problems in terms of recursive solutions:
declare function local:non-consec(
$prev as xs:integer?,
$rest as xs:integer*
) as xs:integer*
{
if (empty($rest)) then ()
else
let $curr := head($rest)
let $next := subsequence($rest, 2, 1)
return (
if ($prev eq $curr - 1 and $curr eq $next - 1)
then () (: This number is part of a consecutive sequence :)
else $curr,
local:non-consec(head($rest), tail($rest))
)
};
local:non-consec((), (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28))
=>
1
7
14
17
24
28

Related

Counting nr of elements in a file

I am trying to count the number of harbour elements in an XML file. However, I keep getting the following error:
item expected, sequence found: (element harbour {...}, ...)
The code snippet is the following:
for $harbour in distinct-values(/VOC/voyage/leftpage/harbour)
let $count := count(/VOC/voyage/leftpage/harbour eq $harbour)
return concat($harbour, " ", $count)
Input XML:
<voyage>
<number>4411</number>
<leftpage>
<harbour>Rammekens</harbour>
</leftpage>
</voyage>
<voyage>
<number>4412</number>
<leftpage>
<harbour>Texel</harbour>
</leftpage>
</voyage>
Can someone help me out? How do I iterate over the number of harbours in the XML file instead of trying to use /VOC/voyage/leftpage/harbour?
eq is a value comparison, i.e. it is used to compare individual items. That is why the error message tells you that it is expecting a (single) item, but instead found all the harbour elements. You have to use the general comparison operator = instead. However, if you simply compare like this
/VOC/voyage/leftpage/harbour = $harbour
the count would always be 1, because the comparison yields a single boolean that only says whether any harbour matches. Instead, you want to keep only those harbour elements whose text equals the current value. You can do so with a predicate in []. All together it will be
for $harbour in distinct-values(/VOC/voyage/leftpage/harbour)
let $count := count(/VOC/voyage/leftpage/harbour[. = $harbour])
return concat($harbour, " ", $count)
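The difference between counting a comparison and counting a filtered sequence can be seen in a small stand-alone sketch. The variable $h and its three inline elements are made up for illustration and are not taken from the original file.

```xquery
let $h := (<harbour>Texel</harbour>, <harbour>Texel</harbour>, <harbour>Rammekens</harbour>)
return (
  (: the general comparison yields ONE boolean, so the count is 1 :)
  count($h = "Texel"),
  (: the predicate keeps the matching elements, so the count is 2 :)
  count($h[. = "Texel"])
  (: $h eq "Texel" would raise a type error, because eq needs single items :)
)
```

This returns 1 followed by 2.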
Also, if your XQuery processor supports XQuery 3.0, you can use a group by clause, which in my opinion is nicer to read (and could be faster, but this depends on the implementation):
for $voyage in /VOC/voyage
let $harbour := $voyage/leftpage/harbour
let $harbour-name := $harbour/string()
group by $harbour-name
return $harbour-name || " " || count($harbour)

Recursively wrapping up an element

Say I have an element <x>x</x> and some empty elements (<a/>, <b/>, <c/>), and I want to wrap up the first inside the second one at a time, resulting in <c><b><a><x>x</x></a></b></c>. How do I go about this when I don't know the number of the empty elements?
I can do
xquery version "3.0";
declare function local:wrap-up($inner-element as element(), $outer-elements as element()+) as element()+ {
if (count($outer-elements) eq 3)
then element{node-name($outer-elements[3])}{element{node-name($outer-elements[2])}{element{node-name($outer-elements[1])}{$inner-element}}}
else
if (count($outer-elements) eq 2)
then element{node-name($outer-elements[2])}{element{node-name($outer-elements[1])}{$inner-element}}
else
if (count($outer-elements) eq 1)
then element{node-name($outer-elements[1])}{$inner-element}
else ($outer-elements, $inner-element)
};
let $inner-element := <x>x</x>
let $outer-elements := (<a/>, <b/>, <c/>)
return
local:wrap-up($inner-element, $outer-elements)
but is there a way to do this by recursion, not descending and parsing but ascending and constructing?
In functional programming, you usually try to work with the first element and the tail of a list, so the canonical solution would be to reverse the input before nesting the elements:
declare function local:recursive-wrap-up($elements as element()+) as element() {
let $head := head($elements)
let $tail := tail($elements)
return
element { name($head) } { (
$head/@*,
$head/node(),
if ($tail)
then local:recursive-wrap-up($tail)
else ()
) }
};
let $inner-element := <x>x</x>
let $outer-elements := (<a/>, <b/>, <c/>)
return (
local:wrap-up($inner-element, $outer-elements),
local:recursive-wrap-up(reverse(($inner-element, $outer-elements)))
)
Whether reverse(...) actually requires building a reversed sequence depends on your XQuery engine. In any case, reversing does not increase the computational complexity, and it may result not only in cleaner code but even in faster execution!
Something similar could be achieved by working from the end of the sequence, but there are no built-in functions for getting the last element and everything before it, and using the predicates [last()] and [position() < last()] may reduce performance. You could use XQuery arrays, but you would have to pass a counter in each recursive function call.
Which solution is fastest in the end will require benchmarking using the specific XQuery engine and code.
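For comparison, here is a sketch of an "ascending" recursion that needs no reverse: the element built so far is passed along as an accumulator and wrapped in the next outer element on every call. The function name local:wrap-from-inside is made up for this example; head() and tail() require XQuery 3.0. Like the code in the question, it copies only the names of the outer elements, not their attributes or content.

```xquery
xquery version "3.0";
declare function local:wrap-from-inside(
  $inner as element(),
  $outer as element()*
) as element() {
  if (empty($outer))
  then $inner  (: nothing left to wrap around the result :)
  else
    (: wrap the current result in the first outer element, then continue :)
    local:wrap-from-inside(
      element { node-name(head($outer)) } { $inner },
      tail($outer)
    )
};
local:wrap-from-inside(<x>x</x>, (<a/>, <b/>, <c/>))
```

This returns <c><b><a><x>x</x></a></b></c>.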

XQuery Split a string by the Nth occurrence of a character

I need some help with splitting a long string of characters by the Nth occurrence of a certain character. For example
<string>1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27</string>
to be split by the 9th comma
and to become
<string>1,2,3,4,5,6,7,8,9</string>
<string>10,11,12,13,14,15,16,17,18</string>
<string>19,20,21,22,23,24,25,26,27</string>
The length of the original string is not specified and the numbers 1-27 in the example could be words with spaces, but the comma is uniquely a separator.
Thanks!
let $s := <string>1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27</string>
let $len := 9
let $tokens := tokenize($s, ',')
(: round up so that a shorter final chunk is kept as well :)
for $n in (1 to (count($tokens) + $len - 1) idiv $len)
return <string>{
string-join(subsequence($tokens, $len * ($n - 1) + 1, $len), ',')
}</string>
For further reference, here is another solution using XQuery 3.0. It groups the tokens with a tumbling window instead of computing chunk positions by hand.
let $s := '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27'
for tumbling window $w in tokenize($s, ',')
start at $start when true()
end at $end when $end - $start eq 8
return <string>{string-join($w, ',')}</string>
This looks like the model use case for windows, in my opinion, and it is quite readable: use a tumbling window (which never overlaps, in contrast to a sliding window, which advances by only one item each turn) and start a window at the beginning of the sequence. End a window once it contains 9 items (i.e. when the end position is 8 greater than the start position).
If you've got access to XQuery 3.0, you can also use analyze-string(...) using some regex foo:
let $string := '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27'
let $result := analyze-string($string, '(?:[^,]+,){8}[^,]+')
return $result/fn:match
Note that the repetition count in the regular expression is one less than the number of values per chunk: the pattern matches that many values each followed by a comma, and then a single value without a trailing comma.
If you also have to deal with the tail, e.g. when dividing the string into tuples of 8 numbers:
let $string := '1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27'
let $result := analyze-string($string, '(?:[^,]+,){7}[^,]+')
return $result/(fn:match/string(), *[last()]/substring(., 2))

Updating counter in XQuery

I want to create a counter in XQuery. My initial attempt looked like the following:
let $count := 0
for $prod in $collection
let $count := $count + 1
return
<counter>{$count }</counter>
Expected result:
<counter>1</counter>
<counter>2</counter>
<counter>3</counter>
Actual result:
<counter>1</counter>
<counter>1</counter>
<counter>1</counter>
The $count variable is either failing to update or being reset. Why can't I reassign an existing variable? What would be a better way to get the desired result?
Try using 'at':
for $d at $p in $collection
return
element counter { $p }
This will give you the position of each '$d'. If you want to use this together with the order by clause, this won't work since the position is based on the initial order, not on the sort result. To overcome this, just save the sorted result of the FLWOR expression in a variable, and use the at clause in a second FLWOR that just iterates over the first, sorted result.
let $sortResult := for $item in $collection
order by $item/id
return $item
for $sortItem at $position in $sortResult
return <item position="{$position}"> ... </item>
As @Ranon said, all XQuery values are immutable, so you can't update a variable. But if you really need an updateable number (which shouldn't be too often), you can use recursion:
declare function local:loop($seq, $count) {
if(empty($seq)) then ()
else
let $prod := $seq[1],
$count := $count + 1
return (
<count>{ $count }</count>,
local:loop($seq[position() > 1], $count)
)
};
local:loop($collection, 0)
This behaves exactly as you intended with your example.
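In XQuery 3.0 a more general version of this function is even defined in the standard library: fn:fold-right (in the final 3.0 specification its signature is fn:fold-right($seq, $zero, $f); earlier drafts took the function as the first argument).
That said, in your example you should definitely use at $count as shown by @tohuwawohu.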
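As a rough sketch of the fold approach, the counters can also be produced with fn:fold-left, assuming a processor that implements the final XQuery 3.0 function library (argument order: sequence, initial value, function). The three-item $collection is a stand-in for the real data. The accumulator holds the counters built so far, so the next number is simply its length plus one.

```xquery
xquery version "3.0";
let $collection := ("a", "b", "c")
return fold-left(
  $collection,
  (),  (: start with no counters :)
  function($acc, $item) {
    (: append one counter per item; no variable is ever reassigned :)
    ($acc, <counter>{ count($acc) + 1 }</counter>)
  }
)
```

This returns <counter>1</counter>, <counter>2</counter> and <counter>3</counter>.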
Immutable variables
XQuery is a functional programming language, which means, among other things, that variables are immutable: you cannot change the value of a variable once it is bound. On the other hand, a powerful collection of functions is available to you, which solves lots of daily programming problems.
let $count := 0
for $prod in $collection
let $count := $count + 1
return
<counter>{$count }</counter>
let $count in line 1 binds this variable for its whole scope, which is all following lines in this case. let $count in line 3 binds a new $count with the value 0 + 1; it shadows the first one and is visible only in the remainder of the current iteration. So you do compute $count + 1 three times, but each time you start from 0 and discard the result immediately.
BaseX' query info shows the optimized version of this query which is
for $prod in $collection
return element { "counter" } { 1 }
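The shadowing can be made visible in a small sketch (not part of the original query): the inner let binds a new $count that exists only inside its own FLWOR expression, while the outer $count keeps its value.

```xquery
let $count := 0
return (
  (: inner binding: a new variable that happens to have the same name :)
  (let $count := $count + 1 return $count),
  (: outer binding: still 0, nothing was modified :)
  $count
)
```

This returns 1 followed by 0.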
The solution
To get the total number of elements in $collection, you can just use
return count($collection)
For a list of XQuery functions, you could have a look at the XQuery part of functx which contains both a list of XQuery functions and also some other helpful functions which can be included as a module.
Specific to MarkLogic you can also use xdmp:set. But this breaks functional language assumptions, so use it conservatively.
http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/apidoc/ExsltBuiltins.xml&category=Extension&function=xdmp:set
For an example of xdmp:set in real-world code, the search parser https://github.com/mblakele/xqysp/blob/master/src/xqysp.xqy might be helpful.
All the solutions above are valid, but I would like to mention that you can use the XQuery Scripting extension to set variable values:
variable $count := 0;
for $prod in (1 to 10)
return {
$count := $count + 1;
<counter>{$count}</counter>
}
You can try this example live at http://www.zorba-xquery.com/html/demo#twh+3sJfRpHhZR8pHhOdsmqOTvQ=
Use xdmp:set instead, as in the query below:
let $count := 0
for $prod in (1 to 4)
return ( xdmp:set($count, number($count + 1)), <counter>{$count}</counter> )
I think you are looking for something like:
XQUERY:
for $x in (1 to 10)
return
<counter>{$x}</counter>
OUTPUT:
<counter>1</counter>
<counter>2</counter>
<counter>3</counter>
<counter>4</counter>
<counter>5</counter>
<counter>6</counter>
<counter>7</counter>
<counter>8</counter>
<counter>9</counter>
<counter>10</counter>

XQuery - problem with recursive function

I'm new on this project and am trying to write what I thought was a simple thing: a recursive function that writes nested XML elements x levels deep (given by a variable). So far I have come up with this, but I keep getting a compile error. Please note that I have to generate new XML, not query existing XML:
xquery version "1.0";
declare function local:PrintTest($amount)
{
<test>
{
let $counter := 0
if ($counter <= $amount )
then local:PrintTest($counter)
else return
$counter := $counter +1
}
</test>
};
local:PrintPerson(3)
My error is:
File Untitled1.xquery: XQuery transformation failed
XQuery Execution Error!
Unexpected token - " ($counter <= $amount ) t"
I never understood XQuery, and can't quite see why this is not working (is it just me, or are there amazingly few resources on the Internet concerning XQuery?)
You have written this function in a procedural manner, but XQuery is a functional language.
Each function body can only be a single expression; it looks like you are trying to write statements (which do not exist in XQuery).
Firstly, your let expression must be followed by a return keyword.
return is only used as part of a FLWOR expression; a function simply evaluates to the value of its body. As you have written it, the bare word return is parsed as a path step (equivalent to child::return), so it would select child elements named return.
The line $counter := $counter + 1 is not valid XQuery at all. You can only bind a variable like this with a let clause, and in that case it would create a new variable called $counter that shadows the old one and is in scope only in the remainder of that FLWOR expression.
The correct way to do what you are trying to do is to reduce the value of $amount each time the function recurses, and to stop when you hit 0.
declare function local:Test($amount)
{
if ($amount = 0)
then ()
else
<test>
{
local:Test($amount - 1)
}
</test>
};
local:Test(3)
Note that I have changed the name of the function to Test. The name "PrintTest" was misleading, as it implies that the function does something (namely, printing). The function in fact just returns a node; it does not do any printing. In a purely functional language (which XQuery is quite close to) a function never has any side effects; it merely returns a value (or in this case a node).
The line $counter := $counter + 1 is valid XQuery Scripting.

Resources