Finding out closest date from today - xquery

I have defined $today = current system date.
Objective: find the "date" element closest to today and get the corresponding "price_per_share" value from the XML below. It does not matter whether the date is in the past or in the future when determining the "closest date".
How do I do this?
I was thinking of:
1) Calculating the difference between today and each "date" value
2) Adding that "difference" as a child of each "details" node
3) Sorting by the smallest "difference"
4) Extracting the "price_per_share" value of the first result
However, adding a child does not seem to work for me.
<test_record>
  <details>
    <date>2013-01-16</date>
    <currency>USD</currency>
    <shares>48</shares>
    <price_per_share>20</price_per_share>
  </details>
  <details>
    <date>2018-05-28</date>
    <currency>USD</currency>
    <shares>49</shares>
    <price_per_share>30</price_per_share>
  </details>
  <details>
    <date>2018-10-25</date>
    <currency>USD</currency>
    <shares>50</shares>
    <price_per_share>40</price_per_share>
  </details>
  <details>
    <date>2018-05-02</date>
    <currency>USD</currency>
    <shares>51</shares>
    <price_per_share>60</price_per_share>
  </details>
</test_record>
I can't use "map" because I am restricted to XQuery 1.0.

The steps you describe sound good, except for the second one. There is no need to append anything, as you don't actually want to modify the nodes. In fact, you can't do this with XQuery alone (XQuery Update is a separate specification intended for this). There is also no need for a map here. Your approach suggests a background in procedural programming, but a functional programming language (XQuery being one) works quite differently, so you might want to familiarize yourself with that style.
Regarding your question: first calculate the difference between each date and today's date. As you said, it shouldn't matter whether the date is in the past or the future, so we need the absolute value of that duration. To get it, we divide the duration by a one-second duration, which yields the number of seconds as a plain number, and then take its absolute value in $abs-diff. We then order by this value, take the first element of the resulting sequence, and return its price_per_share:
let $today := fn:current-date()
return (
  for $detail in //details
  let $diff := xs:date($detail/date) - $today
  let $abs-diff := abs($diff div xs:dayTimeDuration('PT1S'))
  order by $abs-diff
  return $detail
)[1]/price_per_share
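To check the query against the sample data without depending on the current date, here is a minimal self-contained sketch. The inline $records variable (a copy of the sample data trimmed to the two elements actually used) and the fixed $today of 2018-06-01 are assumptions made only for illustration. It also measures the gap in days rather than seconds, which does not change the ordering.

```xquery
(: Assumption: sample data inlined and trimmed to date and price_per_share :)
let $records :=
  <test_record>
    <details><date>2013-01-16</date><price_per_share>20</price_per_share></details>
    <details><date>2018-05-28</date><price_per_share>30</price_per_share></details>
    <details><date>2018-10-25</date><price_per_share>40</price_per_share></details>
    <details><date>2018-05-02</date><price_per_share>60</price_per_share></details>
  </test_record>
(: Assumption: a fixed date instead of fn:current-date(), so the result is reproducible :)
let $today := xs:date('2018-06-01')
return (
  for $detail in $records/details
  (: date minus date gives a dayTimeDuration; dividing by one day gives a number :)
  let $days := abs((xs:date($detail/date) - $today) div xs:dayTimeDuration('P1D'))
  order by $days
  return $detail
)[1]/price_per_share
```

With that fixed date, 2018-05-28 is the nearest entry (4 days away), so the query yields <price_per_share>30</price_per_share>.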

Marklogic: Find documents containing elements without a particular attribute (maybe many per document)

I have some data which looks something like this:
<wrapper>
<inner a="1"/>
<inner a="2" b="3"/>
</wrapper>
The attribute b may or may not be present on each inner element. My aim is to find all documents containing at least one inner element that doesn't have attribute b.
This similar question proposes the answer:
cts:not-query(cts:element-attribute-value-query(xs:QName('inner'), xs:QName('b'), '*', ("wildcarded")))
but that doesn't work, because some inner elements on the same document may have attribute b, and not-queries work on the entire fragment, so a mixed case like the example above would not be returned. Wrapping it in an element-query doesn't help, and cts:and-not-query seems to behave the same way.
I have also tried attacking the problem using co-occurrence/values functions to read the values of the relevant a attributes, but that also seems to be impossible. It might have been possible with proximity settings on co-occurrence calls, except that there is no element text, so the attributes are indexed with the same word positions.
Are there any alternatives to the blunt xpath?
//inner[@a and not(@b)]
You can always make the XPath more complicated if simplicity isn't your goal.
How about this one (it more accurately answers the exact question of "return all documents that contain 'inner' elements that do not have an attribute @b"):
doc()[exists(//inner[not(@b)])]
I do not know how well this is optimized -- some xpath expressions optimize down to the equivalent cts: query and some do not.
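As a small illustration of what that predicate selects, here is a sketch in plain XQuery that does not touch a database. The two inline wrapper elements in $docs are made-up stand-ins for stored documents, and the path is written relative (.//inner) because the items are elements rather than document nodes.

```xquery
(: Assumption: two in-memory "documents" standing in for doc() results :)
let $docs := (
  <wrapper><inner a="1"/><inner a="2" b="3"/></wrapper>,
  <wrapper><inner a="4" b="5"/></wrapper>
)
(: keep only wrappers that contain at least one inner element without @b :)
return $docs[exists(.//inner[not(@b)])]
```

Only the first wrapper is returned: it is the mixed case from the question, with one inner element lacking b. This says nothing about how well the expression is optimized against indexes.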
There is another 'trick' involving combining cts expressions represented as maps. Take the results of 2 searches, use the options that return the results as a map, then you can use the operations on this page https://developer.marklogic.com/blog/im-a-map to do extremely efficient set operations (union, intersection, difference etc). When properly constructed, this technique can be as fast as 'native' cts searches --- the cts searches use the same general technique internally for resolving results.
Make the XPath a path range index: //inner[@a and not(@b)], or, if there's no element text, //inner[@a and not(@b)]/@a. Then do
cts:path-range-query('//inner[@a and not(@b)]/@a','>','')
This also happens to let us efficiently answer the question of which @a values have a missing @b, using cts:values.
cts:not-in-query has the necessary behaviour to make this work where cts:and-not-query doesn’t. E.g.
cts:not-in-query(
cts:element-query(xs:QName('inner'), cts:true-query()),
cts:element-attribute-value-query(xs:QName('inner'), xs:QName('b'), '*', 'wildcarded')
)
Finds all ‘inner’ elements at positions that do not match the positions of ‘inner’ elements with attribute b.
Element position index must be enabled. Wildcard index must be enabled.
http://docs.marklogic.com/cts:not-in-query

Xquery - Using Max and Min

I have been doing a homework exercise in XQuery and I am currently stuck. The assignment was to generate information about the two continents that, after 50 years, would have the largest and the smallest population increase, respectively. All I have left is to take the min and max; the data is all saved in $minAndMaxCont and looks like this:
<Continent name="asia" pop="4243769598" futurePop="7255593125" increase="3011823527" ratio="1.709704770122159681"/>
<Continent name="africa" pop="1043912572" futurePop="3405022718" increase="2361110146" ratio="3.261789166382412339"/>
<Continent name="america" pop="955621605" futurePop="1510928928" increase="555307323" ratio="1.581095404388643976"/>
<Continent name="australia" pop="93146473" futurePop="156995765" increase="63849292" ratio="1.685471923343785653"/>
<Continent name="europe" pop="633227105" futurePop="693248396" increase="60021291" ratio="1.094786357889717939"/>
So what I want to do is extract the minimum and maximum values of "increase", which seems simple enough. But I cannot get it to work. I have tried a lot of different approaches, one of them being the max and min functions, using other threads as guides.
One thread that I followed was this one:
How can I use XPath to find the minimum value of an attribute in a set of elements?
From there I took this code:
let $xml := <foo>
  <bar id="1" score="192" />
  <bar id="2" score="227" />
  <bar id="3" score="105" />
</foo>
let $min := min($xml/bar/@id)
let $max := max($xml/bar/@id)
return $max
And this works perfectly fine; it returns the min/max value (I can use both XQuery and XPath solutions, by the way). However, when I attempt to do something similar with my own collection of data, like this:
let $incMin := min($minAndMaxCont/@increase)
return $incMin
The generated result becomes this (It behaves the same way with max() too):
3.011823527E9
2.361110146E9
5.55307323E8
6.3849292E7
6.0021291E7
So instead of extracting a minimum (or maximum) value, it converts the whole list into another form and does nothing else with it. I really want to get this to work, and I am also genuinely curious as to why it converts the entries into another form instead of extracting the max value. I would very much appreciate any help.
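No answer is recorded for this question, so the following is only a hedged sketch. One part of the output can be explained from the XQuery function definitions: min() and max() cast untyped attribute values to xs:double, which is why the numbers print in exponent form. Getting five values back instead of one would be consistent with min() being evaluated once per Continent element (for example inside a for loop), but that is a guess, since the surrounding query is not shown. The sketch assumes $minAndMaxCont holds all five elements as a single sequence, with the attributes trimmed to the ones needed here.

```xquery
(: Assumption: all five elements are bound to the variable as one sequence :)
let $minAndMaxCont := (
  <Continent name="asia" increase="3011823527"/>,
  <Continent name="africa" increase="2361110146"/>,
  <Continent name="america" increase="555307323"/>,
  <Continent name="australia" increase="63849292"/>,
  <Continent name="europe" increase="60021291"/>
)
(: cast explicitly so min/max work on integers instead of xs:double :)
let $increases := for $c in $minAndMaxCont return xs:integer($c/@increase)
let $incMin := min($increases)
let $incMax := max($increases)
return (
  (: select the elements whose increase equals the extreme values :)
  $minAndMaxCont[xs:integer(@increase) = $incMin],
  $minAndMaxCont[xs:integer(@increase) = $incMax]
)
```

On this data the result is the europe element (smallest increase) followed by the asia element (largest increase).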
//With kind regards.

Checking last element in a boost::fusion::for_each loop

I want to know if there is a way to check for the last element in a fusion for_each loop (in order to apply special code for this case)
Edit : Maybe a better question should be :
I have played with fusion::for_each, and now I want to apply code to each element of a fusion sequence, with special code (special code does not mean "extra code" but different code) for the last element. Maybe I should use iterators (an example, please)?
Some ideas:
1) use boost::fusion::fold, count your way through, and on the last one, perform your edit
2) if all types in the tuple are heterogeneous, match on type to determine the last one
3) include some sort of marker for the last one on which you can match
4) use the 'prior(end(v))' operators to manipulate the last element when for_each processing is complete

What is the best way to determine what articles are available for a given usenet group?

I was wondering what the most efficient way is to get the available articles for a given nntp group. The method I have implemented works as follows:
(i) Select the group:
GROUP group.name.subname
(ii) Get a list of article numbers from the group (pushed back into a vector 'codes'):
LISTGROUP
(iii) Loop over codes and grab articles (e.g. headers)
for code in codes do
HEAD code
end
However, this doesn't scale well with large groups with many article codes.
In RFC 3977, the GROUP command is indicated as also returning the 'low' and 'high' article numbers. For example,
[C] GROUP misc.test
[S] 211 1234 3000234 3002322 misc.test
where 3000234 and 3002322 are the low and high numbers. I'm therefore thinking of using these rather than initially pushing back all article codes. But can these numbers be relied upon? Is 3000234 definitely the first article id in the selected group, and likewise is 3002322 definitely the last article id in that group, or are they just estimates?
Many thanks,
Ben
It turns out I was thinking about this all wrong. All I need to do is
(i) set the group using GROUP
(ii) execute the NEXT command followed by HEAD for however many headers I want (up to count):
for c : count do
  articleId <-- NEXT
  HEAD articleId
end
EDIT: I'm sure there must be a better way but until anyone suggests otherwise I'll assume this way to be the most effective. Cheers.

XQuery - Copy Constructor?

Given the following query
let $a := xs:dateTime("2012-01-01T00:00:00.000+00:00")
let $b := xs:dateTime($a)
let $c := xs:dateTime($a cast as xs:string)
(: cannot - don't know how to - execute the function without assignment :)
let $d := adjust-dateTime-to-timezone($a, xs:dayTimeDuration("PT1H"))
return (<a>{$a}</a>,<b>{$b}</b>,<c>{$c}</c>)
the output is as follows
<a>2012-01-01T01:00:00+01:00</a>
<b>2012-01-01T01:00:00+01:00</b>
<c>2012-01-01T00:00:00Z</c>
Based on XQuery's documentation on constructor functions (the constructor function for a given type is used to convert instances of other atomic types into the given type) this is the expected behaviour. Calling xs:dateTime($a) simply returns $a as there is no need to cast, but xs:dateTime($a cast as xs:string) creates a new xs:string from $a first. However this requires an extra conversion.
Is there any other way to tackle this problem? Or are conversions cheap, so that I shouldn't care?
(If it makes any difference my XQuery processor is BaseX 7.2.)
It seems it does make a difference that I'm using BaseX. I really thought that this was how the xs:dateTime constructor function and the adjust-dateTime-to-timezone function were supposed to work, which is why I misinterpreted the XQuery documentation.
Given the input I've received from Dimitre and Ranon, it seems the problem described is gone.
By the way, my use case is, or rather was, that I wanted to run a date-time interval query against the date-time elements of my XML data set. Because the input parameters and the source date-time values used different time zones, I had to make time-zone corrections with the above function, which modified its input parameter (the original source date-time in my case), whereas I wanted to preserve the original value. Given the function's name, adjust-dateTime, I thought it was okay for it to modify its argument, so I assumed I had to copy my original value using a constructor function in order to keep the original date-time value.
Looks like you ran into some really weird bug.
Your line 5 shouldn't change $a through $c at all, as XQuery is a functional programming language with immutable variables and without side effects (adjust-dateTime-to-timezone should not change your variables). That's why you were forced to assign $d; otherwise the calculated result would simply have been thrown away.
I just submitted a bug report. Zorba evaluates your query correctly, so you can use it to understand the problem.
BaseX, as your preferred XQuery processor, will do so within a few days, too. I or another BaseX team member will notify you here as soon as it's fixed.
I guess your problem arose from a misunderstanding combined with the wrong behaviour of BaseX, and it should be solved soon. Feel free to ask again if anything about your query is still unclear.
The output that is reported is incorrect.
The correct output (produced running Saxon under oXygen) is:
<a>2012-01-01T00:00:00Z</a>
<b>2012-01-01T00:00:00Z</b>
<c>2012-01-01T00:00:00Z</c>
The adjust-dateTime-to-timezone() function, like any other function, cannot modify its arguments. Its effect is contained only in the variable $d, which you don't use in the return clause.
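To make the point concrete, here is a minimal sketch that outputs the adjusted value next to the original. The element names adjusted and a-again are invented for illustration. It also shows that the function call can be used directly as an expression, without a let assignment.

```xquery
let $a := xs:dateTime("2012-01-01T00:00:00.000+00:00")
return (
  <a>{$a}</a>,
  (: the call returns a new value; $a itself is not touched :)
  <adjusted>{adjust-dateTime-to-timezone($a, xs:dayTimeDuration("PT1H"))}</adjusted>,
  (: $a is output again after the call to show it is unchanged :)
  <a-again>{$a}</a-again>
)
```

A conforming processor should output 2012-01-01T00:00:00Z for both a and a-again, and 2012-01-01T01:00:00+01:00 for adjusted.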
