Get the most repeated element in a sequence with XQuery

I've got a sequence of values. They can all be equal... or not. So with XQuery I want to get the most frequent item in the sequence.
let $counter := 0, $index1 := 0
for $value in $sequence
if (count(index-of($value, $sequence)))
then
{
$counter := count(index-of($value, $sequence)) $index1 := index-of($value)
} else {}
I can't make this work, so I suppose I'm doing something wrong.
Thanks in advance for any help you could give me.

Use:
for $maxFreq in
max(for $val in distinct-values($sequence)
return count(index-of($sequence, $val))
)
return
distinct-values($sequence)[count(index-of($sequence, .)) eq $maxFreq]
Update, Dec. 2015:
This is notably shorter, though it may not be very efficient:
$pSeq[index-of($pSeq,.)[max(for $item in $pSeq return count(index-of($pSeq,$item)))]]
The shortest expression can be constructed for XPath 3.1:
And even shorter and copyable -- using a one-character name:
$s[index-of($s,.)[max($s ! count(index-of($s, .)))]]

You are approaching this problem from too much of an imperative standpoint.
In XQuery you can set the values of variables, but you can never change them.
The correct way to do iterative-type algorithms is with a recursive function:
declare function local:most($sequence, $index, $value, $count)
{
let $current := $sequence[$index]
return
if (empty($current))
then $value
else
let $current-count := count(index-of($sequence, $current))
return
if ($current-count > $count)
then local:most($sequence, $index+1, $current, $current-count)
else local:most($sequence, $index+1, $value, $count)
};
but a better way of approaching the problem is to describe it in a non-iterative way. In this case: of all the distinct values in your sequence, you want the one that appears the maximum number of times of any distinct value.
The previous sentence translated into XQuery is
let $max-count := max(for $value1 in distinct-values($sequence)
return count(index-of($sequence, $value1)))
for $value2 in distinct-values($sequence)
where count(index-of($sequence, $value2)) = $max-count
return $value2
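The same "count each distinct value, keep those with the maximum count" idea can also be sketched with a map, assuming an XQuery 3.1 processor. This is a different technique from the index-of approach above, and the function name local:most-frequent is only illustrative:

```xquery
xquery version "3.1";

(: Hypothetical helper: returns every value that occurs most often. :)
declare function local:most-frequent($seq as xs:anyAtomicType*) as xs:anyAtomicType*
{
  (: one map entry per distinct value, holding its number of occurrences :)
  let $freq := map:merge(
    for $v in $seq
    group by $key := $v
    return map:entry($key, count($v))
  )
  (: the largest count; the empty sequence if $seq is empty :)
  let $max := max($freq?*)
  return map:keys($freq)[$freq(.) eq $max]
};

local:most-frequent(("a", "b", "a", "c", "b", "a"))
```

For the sample call this should return "a". If several values tie for the maximum, all of them are returned, in no guaranteed order.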

Related

XQuery - wrong indexes in substring after reverse-string function use

I'm trying to implement base64 encoding in a very simple way. In my approach (let's put aside for a second whether it's appropriate or not) I need to reverse strings and then concatenate them. After that, the concatenated string is passed to the substring function. The strings are joined properly, but when I use substring, BaseX seems to lose it.
The funny thing is that substring works well for all lengths starting at 8. So substring($string, 1, 8) and higher gives the correct output, but everything below that is messed up, starting with one character going missing: substring($string, 1, 7) (and below) results in a string of length 6.
Moreover, substring can only start at index 1 or 0. Anything greater returns an empty result.
declare variable $array := [];
declare function bs:encode
( $input as xs:string ) {
bs:integer-to-binary(string-to-codepoints($input), "", $array)
} ;
declare function bs:integer-to-binary
( $input as xs:integer*, $string as xs:string, $array as array(xs:string) ) {
let $strings :=
for $i in $input
return
if ($i != 0)
then if ($i mod 2 = 0)
then bs:integer-to-binary(xs:integer($i div 2), concat($string, 0), $array)
else bs:integer-to-binary(xs:integer($i div 2), concat($string, 1), $array)
else if ($i <= 0)
then array:append($array, $string)
return bs:check-if-eight($strings)
} ;
declare function bs:check-if-eight
( $strings as item()+ ) {
let $fullBinary :=
for $string in $strings
return if (string-length($string) < 8)
then bs:check-if-eight(concat($string, 0))
else $string (: add as private below :)
return bs:concat-strings($fullBinary)
} ;
declare function bs:concat-strings
( $strings as item()+ ) {
let $firstStringToConcat := functx:reverse-string($strings[position() = 1])
let $secondStringToConcat := functx:reverse-string($strings[position() = 2])
let $thirdStringToConcat := functx:reverse-string($strings[position() = 3])
let $concat :=
concat
($firstStringToConcat,
$secondStringToConcat,
$thirdStringToConcat)
(: this returns correct string of binary value for Cat word :)
return bs:divide-into-six($concat)
} ;
declare function bs:divide-into-six
( $binaryString as xs:string) {
let $sixBitString := substring($binaryString, 1, 6)
(: this should return 010000 instead i get 000100 which is not even in $binaryString at all :)
return $sixBitString
} ;
bs:encode("Cat")
I expect the first six characters of the string (010000); instead I get what looks like some random sequence (00100). The whole module is meant to encode strings into base64 format, but for now the part I uploaded should just return the first six bits for 'C'.
Alright so I figured it out I guess.
First of all in function concat-strings I changed concat to fn:string-join. It allowed me to pass as an argument symbol that separates joined strings.
declare function bs:concat-strings ( $strings as item()+ ) {
let $firstStringToConcat := xs:string(functx:reverse-string($strings[position() = 1]))
let $secondStringToConcat := xs:string(functx:reverse-string($strings[position() = 2]))
let $thirdStringToConcat := xs:string(functx:reverse-string($strings[position() = 3]))
let $concat :=
fn:string-join(
($firstStringToConcat,
$secondStringToConcat,
$thirdStringToConcat), 'X')
return bs:divide-into-six($concat) } ;
I saw that my input looked like this:
XXXXXXXX01000011XXXXXXXXXXXXXXXXX01100001XXXXXXXXXXXXXXXXX01110100XXXXXXXX
Obviously it had to be looping somewhere without an obvious for loop, and as I'm a novice in XQuery I must have missed that. And indeed, I found it in the check-if-eight function:
declare function bs:check-if-eight ( $strings as item()+ ) {
let $fullBinary :=
for $string in $strings
return if (string-length($string) < 8)
then bs:check-if-eight(concat($string, 0))
else $string (: add as private below :)
return bs:concat-strings($fullBinary) } ;
Despite being declared above the for keyword, the $fullBinary variable was effectively evaluated in a loop and produced empty strings(?), which showed up clearly once I used X as a separator.
DISCLAIMER: I thought about this before and used functx:trim, but for some reason it doesn't work as I expected. So it might not work for you either if you have a similar issue.
At this point it was clear that let $fullBinary cannot be bound inside the FLWOR expression like this, or at least that it must not trigger the concat-strings function from there. I changed it so that it now produces only a string, and I'm now trying to work out a new order for running the whole module, but I think the main problem here is solved.
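For comparison, here is a minimal sketch of one way to get the padded bits without calling the next stage from inside the recursion, so nothing is joined more than once. It is not the code from the question; the helper names local:to-binary and local:to-byte are made up for illustration, and it assumes XQuery 3.0 or later for the ! operator:

```xquery
xquery version "3.0";

(: Hypothetical helper: binary digits of a non-negative integer, most significant first. :)
declare function local:to-binary($n as xs:integer) as xs:string
{
  if ($n lt 2) then string($n)
  else concat(local:to-binary($n idiv 2), $n mod 2)
};

(: Hypothetical helper: left-pad the digits with zeros to at least 8 characters. :)
declare function local:to-byte($n as xs:integer) as xs:string
{
  let $bits := local:to-binary($n)
  let $padding := string-join((1 to 8 - string-length($bits)) ! "0", "")
  return concat($padding, $bits)
};

(: Convert every codepoint first, join once, and only then slice. :)
let $bits := string-join(string-to-codepoints("Cat") ! local:to-byte(.), "")
return substring($bits, 1, 6)
```

For "Cat" the joined string starts with 01000011 (the codepoint 67 for 'C'), so the first six characters should be 010000.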

Removing consecutive numbers from a sequence in XQuery

XQuery
Input: (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
Output: (1,7,14,17,24,28)
I tried to remove consecutive numbers from the input sequence using the XQuery functions but failed doing so
xquery version "1.0" encoding "utf-8";
declare namespace ns1="http://www.somenamespace.org/types";
declare variable $request as xs:integer* external;
declare function local:func($reqSequence as xs:integer*) as xs:integer* {
let $nonRepeatSeq := for $count in (1 to count($reqSequence)) return
if ($reqSequence[$count+1] - $reqSequence) then
remove($reqSequence,$count+1)
else ()
return
$nonRepeatSeq
};
local:func((1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28))
Please suggest how to do so in XQuery functional language.
Two simple ways to do this in XQuery. Both rely on being able to assign the sequence of values to a variable, so that we can look at pairs of individual members of it when we need to.
First, just iterate over the values and select (a) the first value, (b) any value which is not one greater than its predecessor, and (c) any value which is not one less than its successor. [OP points out that the last value also needs to be included; left as an exercise for the reader. Or see Michael Kay's answer, which provides a terser formulation of the filter; DeMorgan's Law strikes again!]
let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
for $v at $pos in $vseq
return if ($pos eq 1
or $vseq[$pos - 1] ne $v - 1
or $vseq[$pos + 1] ne $v + 1)
then $v
else ()
Or, second, do roughly the same thing in a filter expression:
let $vseq := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
return $vseq[
for $i in position() return
$i eq 1
or . ne $vseq[$i - 1] + 1
or . ne $vseq[$i + 1] - 1]
The primary difference between these two ways of performing the calculation and your non-working attempt is that they don't say anything about changing or modifying the sequence; they simply specify a new sequence. By using a filter expression, the second formulation makes explicit that the result will be a subsequence of $vseq; the for expression makes no such guarantee in general (although because for each value it returns either the empty sequence or the value itself, we can see that here too the result will be a subsequence: a copy of $vseq from which some values have been omitted).
Many programmers find it difficult to stop thinking in terms of assignment to variables or modification of data structures, but it's worth some effort.
[Addendum] I may be overlooking something, but I don't see a way to express this calculation in pure XPath 2.0, since XPath 2.0 seems not to have any mechanism that can bind a variable like $vseq to a non-singleton sequence of values. (XPath 3.0 has let expressions, so it's not a challenge there. The second formulation above is itself pure XPath 3.0.)
In XSLT this can be done as:
<xsl:for-each-group select="$in" group-adjacent=". - position()">
<xsl:sequence select="current-group()[1], current-group()[last()]"/>
</xsl:for-each-group>
In XQuery 3.0 you can do it with tumbling windows, but I'm too lazy to work out the detail.
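A rough sketch of what the tumbling-window version might look like, assuming a processor that supports the XQuery 3.0 window clause (this is my reading of the idea, not code from the answer):

```xquery
xquery version "3.0";

let $in := (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28)
for tumbling window $w in $in
  (: every item that is not already inside a window starts a new one :)
  start when true()
  (: close the window when the next item does not continue the run;
     at the end of the input $n is empty and the last window closes by itself :)
  end $e next $n when $n ne $e + 1
return
  (: a run of length one contributes its single value only once :)
  if (count($w) eq 1) then $w else ($w[1], $w[last()])
```

For the sample input the windows are 1 to 7, 14 to 17 and 24 to 28, so the result should be 1, 7, 14, 17, 24, 28.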
An XPath 2.0 solution (assuming the input sequence is in $in) is:
for $i in 1 to count($in)
return $in[$i][not(. eq $in[$i - 1]+1 and . eq $in[$i+1]-1)]
There are several logic and XQuery usage errors in your solution, but the main problem with it is that variables in XQuery are immutable, so you cannot reassign a value to one once assigned. Therefore, it's often easier to think about these types of problems in terms of recursive solutions:
declare function local:non-consec(
$prev as xs:integer?,
$rest as xs:integer*
) as xs:integer*
{
if (empty($rest)) then ()
else
let $curr := head($rest)
let $next := subsequence($rest, 2, 1)
return (
if ($prev eq $curr - 1 and $curr eq $next - 1)
then () (: This number is part of a consecutive sequence :)
else $curr,
local:non-consec(head($rest), tail($rest))
)
};
local:non-consec((), (1,2,3,4,5,6,7,14,15,16,17,24,25,26,27,28))
=>
1
7
14
17
24
28

Build dictionary in XQuery for loop and count occurrences of similar nodes

I'm trying to count occurrences of a string during a for loop in a dictionary (BaseX map). It seems that the contents of the dictionary are cleared after each iteration. Is there a way to keep the info throughout the loop?
declare variable $dict as map(*) := map:merge(());
for $x at $cnt in //a order by -$cnt
let $l := (if (map:contains($dict, $x/@line)) then (fn:number(map:get($dict, $x/@line))) else (0))
let $dict := map:put($dict, $x/@line, 1 + $l)
return (
$dict,
if ($x[@speaker="player.computer" or @speaker = "event.object"])
then ( <add sel="(//{fn:name($x)}[@line='{$x/@line}'])[{fn:string(map:get($dict, $x/@line))}]" type="@hidechoices">false</add> )
else ( <remove sel="(//{fn:name($x)}[@line='{$x/@line}'])[1]" />)
)
so for this xml:
<a line="x" />
<a line="y" />
<a line="y" />
<a line="z" />
i should get something like this for the first:
{
"x": 1
}
and this for the last iteration:
{
"x": 1,
"y": 2,
"z": 1
}
I have to construct some text out of this in the end; that's the last part of the output.
Right now I only get the current key/value pair at each iteration, so $dict has only one entry throughout the whole execution, and $l is always 0.
Thankfully this worked:
for $x at $cnt in //a
let $dict := map:merge((
for $y at $pos in //a
let $line := $y/@line
where $pos <= $cnt
group by $line
return map:entry($line, count($y))
))
return (
$dict,
if ($x[@speaker="player.computer" or @speaker = "event.object"])
then ( <add sel="(//{fn:name($x)}[@line='{$x/@line}'])[{fn:string(map:get($dict, $x/@line))}]" type="@hidechoices">false</add> )
else ( <remove sel="(//{fn:name($x)}[@line='{$x/@line}'])[1]" />)
)
For some reason could not use position() to limit the inner for, it returned all nodes right at first iteration.
Thanks a lot for your help!
Your whole approach is flawed. XQuery is a functional language, and the way you describe your problem and wrote your query indicates that you do not yet fully grasp the functional programming paradigm (which is fully understandable, as it is quite different from procedural programming). I would suggest you read up on the topic in general.
Instead of iterating over all elements in a procedural way, you can use a FLWOR expression with group by:
let $map := map:merge((
for $x in //a
let $line := $x/@line
group by $line
return map:entry($line, count($x))
))
This holds the result you expected. It iterates over the a elements and groups them together by their line attribute.
Another remark: Your output XML in the sel attribute looks suspiciously like the path to a certain element. Are you aware of the fn:path function, which gives you exactly that?
Based on your update from the comments you can calculate the map multiple times, but just up to the current position:
for $y at $pos in //a
let $map := map:merge((
for $x in //a[position() <= $pos]
let $line := $x/@line
group by $line
return map:entry($line, count($x))
))
return $map
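If what you are really after is "carry a dictionary through a loop", a fold is the functional counterpart. The following is only a sketch under the assumption of an XQuery 3.1 processor; the plain strings in $lines stand in for the @line values of your a elements:

```xquery
xquery version "3.1";

(: hypothetical stand-in for //a/@line :)
let $lines := ("x", "y", "y", "z")
return fold-left($lines, map {}, function($dict, $line) {
  (: map:get returns the empty sequence for an unknown key, so fall back to 0 :)
  let $seen := (map:get($dict, $line), 0)[1]
  (: map:put returns a NEW map; the old one is never modified :)
  return map:put($dict, $line, $seen + 1)
})
```

Each step receives the map produced by the previous step, which is the effect the rebinding of $dict inside the for loop was meant to have. The final result should map x to 1, y to 2 and z to 1.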

assigning an operator to a variable in xquery

is there a way to assign a numeric operator to a variable in Xquery?
I have to perform an arithmetic expression on a given pair of values depending upon a node tag.
I've managed to do this, but it has resulted in a lot of duplicate code. I'd like to simplify the query so that instead of:
Function for Add
Repeated if code - this calls out to other functions but is still repeated
$value1 + $value2
Function for Minus
Repeated if code
$value1 - $value2
etc for multiply, div etc
I'd like to set up a function and send a variable to it, something similar to this:
$value1 $operator $value2
Is there a simple way to do this in xquery?
thank you for your help.
If your query processor supports XQuery 3.0, you can use function items for that:
declare function local:foo($operator, $x, $y) {
let $result := $operator($x, $y)
return 2 * $result
};
local:foo(...) can then be called like this:
let $plus := function($a, $b) { $a + $b },
$mult := function($a, $b) { $a * $b }
return (
local:foo($plus, 1, 2),
local:foo($mult, 3, 4)
)
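Since the operator depends on a node tag, the function items could also be stored in a map keyed by element name. This is only a sketch assuming XQuery 3.1; the element names add, minus, multiply and divide and the names $local:ops and local:apply are made up for illustration:

```xquery
xquery version "3.1";

(: hypothetical lookup table: element name -> two-argument function :)
declare variable $local:ops := map {
  "add":      function($a, $b) { $a + $b },
  "minus":    function($a, $b) { $a - $b },
  "multiply": function($a, $b) { $a * $b },
  "divide":   function($a, $b) { $a div $b }
};

(: hypothetical helper: pick the function by the node's name and call it;
   an element name that is not in the map raises a type error here :)
declare function local:apply($node as element(), $x, $y)
{
  $local:ops(local-name($node))($x, $y)
};

local:apply(<multiply/>, 3, 4)
```

The sample call should return 12. The repeated if code then only has to exist once, around the call to local:apply.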
Why don't you use a simple if-else construct? E.g.
if (repeated code says you should add) then
$value1 + $value2
else
$value1 - $value2
You could also simply put the repeated code in another function instead of copying the code.

Updating counter in XQuery

I want to create a counter in xquery. My initial attempt looked like the following:
let $count := 0
for $prod in $collection
let $count := $count + 1
return
<counter>{$count }</counter>
Expected result:
<counter>1</counter>
<counter>2</counter>
<counter>3</counter>
Actual result:
<counter>1</counter>
<counter>1</counter>
<counter>1</counter>
The $count variable is either failing to update or being reset. Why can't I reassign an existing variable? What would be a better way to get the desired result?
Try using 'at':
for $d at $p in $collection
return
element counter { $p }
This will give you the position of each '$d'. If you want to use this together with the order by clause, this won't work since the position is based on the initial order, not on the sort result. To overcome this, just save the sorted result of the FLWOR expression in a variable, and use the at clause in a second FLWOR that just iterates over the first, sorted result.
let $sortResult := for $item in $collection
order by $item/id
return $item
for $sortItem at $position in $sortResult
return <item position="{$position}"> ... </item>
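If your processor supports XQuery 3.0, the count clause may save you the second FLWOR, because it numbers the tuples after order by has been applied. A small sketch with made-up sample data in place of $collection:

```xquery
xquery version "3.0";

(: hypothetical sample data standing in for $collection :)
let $collection := (
  <item><id>3</id></item>,
  <item><id>1</id></item>,
  <item><id>2</id></item>
)
for $item in $collection
order by xs:integer($item/id)
(: count binds $position to 1, 2, 3 ... in the sorted order :)
count $position
return <item position="{$position}">{ $item/id }</item>
```

Here the item with id 1 should come out with position 1, id 2 with position 2 and id 3 with position 3.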
As @Ranon said, all XQuery values are immutable, so you can't update a variable. But if you really need an updateable number (which shouldn't be too often), you can use recursion:
declare function local:loop($seq, $count) {
if(empty($seq)) then ()
else
let $prod := $seq[1],
$count := $count + 1
return (
<count>{ $count }</count>,
local:loop($seq[position() > 1], $count)
)
};
local:loop($collection, 0)
This behaves exactly as you intended with your example.
In XQuery 3.0 a more general version of this function is even defined in the standard library: fn:fold-right($seq, $zero, $f)
That said, in your example you should definitely use at $count as shown by @tohuwawohu.
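To give an idea of the fold variant, here is a minimal sketch using fn:fold-left (the left-to-right sibling of fold-right) with the argument order of the final XQuery 3.0 function library. The three strings are placeholder data for $collection:

```xquery
xquery version "3.0";

(: hypothetical stand-in for $collection :)
let $collection := ("a", "b", "c")
return fold-left($collection, (), function($acc, $prod) {
  (: the accumulator holds the counters built so far, so its size is the running count :)
  ($acc, <counter>{ count($acc) + 1 }</counter>)
})
```

This should produce the three counter elements 1, 2 and 3, without any variable ever being reassigned.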
Immutable variables
XQuery is a functional programming language, which involves amongst others immutable variables, so you cannot change the value of a variable. On the other hand, a powerful collection of functions is available to you, which solves lots of daily programming problems.
let $count := 0
for $prod in $collection
let $count := $count + 1
return
<counter>{$count }</counter>
let $count in line 1 defines this variable for its whole scope, which is all following lines in this case. let $count in line 3 defines a new $count with the value 0+1, valid in all following lines within this code block. It shadows the first one for the current iteration only and is not carried over to the next iteration. So you do indeed compute 0+1 three times, but discard the result immediately.
BaseX' query info shows the optimized version of this query which is
for $prod in $collection
return element { "counter" } { 1 }
The solution
To get the total number of elements in $collection, you can just use
count($collection)
For a list of XQuery functions, you could have a look at the XQuery part of functx which contains both a list of XQuery functions and also some other helpful functions which can be included as a module.
Specific to MarkLogic you can also use xdmp:set. But this breaks functional language assumptions, so use it conservatively.
http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/apidoc/ExsltBuiltins.xml&category=Extension&function=xdmp:set
For an example of xdmp:set in real-world code, the search parser https://github.com/mblakele/xqysp/blob/master/src/xqysp.xqy might be helpful.
All the solutions above are valid, but I would like to mention that you can use the XQuery Scripting extension to set variable values:
variable $count := 0;
for $prod in (1 to 10)
return {
$count := $count + 1;
<counter>{$count}</counter>
}
You can try this example live at http://www.zorba-xquery.com/html/demo#twh+3sJfRpHhZR8pHhOdsmqOTvQ=
Use xdmp:set instead, as in the query below:
let $count := 0
for $prod in (1 to 4)
return ( xdmp:set($count,number($count+1)) ,<counter>{$count }</counter> )
I think you are looking for something like:
XQUERY:
for $x in (1 to 10)
return
<counter>{$x}</counter>
OUTPUT:
<counter>1</counter>
<counter>2</counter>
<counter>3</counter>
<counter>4</counter>
<counter>5</counter>
<counter>6</counter>
<counter>7</counter>
<counter>8</counter>
<counter>9</counter>
<counter>10</counter>

Resources