Rewrite rules that convert tokens to integer parameters - dictionary

After much wrestling with the idea of ranking records, I finally settled on numeric scores for my documents, which I emit as view keys so that the documents come back sorted by score.
These numbers carry meaning: the first two digits represent a specific type of document.
So, to get documents of type 22 sorted by score, I simply query the view with start key 220000 and end key 229999.
This all works; my problems start when I try to use URL rewrites.
I'm basically trying to reroute:
/_rewrite/rankings/{doctype}
to
/_list/rankings?startkey=xx0000&endkey=xx9999
where xx is the {doctype}
My issue is with specifying the rewrite rule:
[
{ "from":"rankings/:doctype",
"to":"_list/rankings",
"query": ??? //what will this be?
}
]
How can I construct the start and end keys by appending 0000 and 9999 respectively?
How can I specify a numeric value? Using the placeholder ":doctype" yields a string rather than a number, so the query fails even if I change my pretty URL to take both the start and end keys.
I worked around the issue by filtering the results in my list function (skipping the docs I'm not interested in as they come from getRow()). My concern is: should I now worry about the efficiency of the list function?
Feel free to comment on my sorting strategy as well; I would be interested to know how others have solved their sorting and slicing problems with CouchDB.

Solution
First, you should emit the type and the score separately in an array instead of concatenating them:
emit([doc.type, doc.score], doc);
Then you can rewrite like this
[
{
"from" : "rankings/:doctype",
"to" : "_list/rankings/rankings",
"query" : {
"startkey" : [":doctype", 0],
"endkey" : [":doctype", 9999]
},
"formats": {
"doctype" : "int"
}
}
]
I tested it on CouchDB 1.1.1 and it works.
Reference
The relevant documentation is buried in this issue on JIRA: COUCHDB-1074
As you can see, the issue was resolved in April 2011, so it should work in CouchDB 1.0.3 and above.
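To see why the array key makes the range query work, here is a minimal Python sketch of the idea, not CouchDB itself. The rows and document IDs are made up for illustration, and it relies on the fact that Python tuples of integers compare element by element, which mirrors how CouchDB orders array keys made of numbers.

```python
from bisect import bisect_left, bisect_right

# Hypothetical view rows, as if produced by emit([doc.type, doc.score], ...).
# Sorting the tuples stands in for the view's key ordering.
rows = sorted([
    ((21, 9500), "doc-a"),
    ((22, 10), "doc-b"),
    ((22, 4200), "doc-c"),
    ((22, 9999), "doc-d"),
    ((23, 0), "doc-e"),
])


def query_range(rows, startkey, endkey):
    """Return rows with startkey <= key <= endkey (both ends inclusive)."""
    keys = [key for key, _ in rows]
    lo = bisect_left(keys, startkey)
    hi = bisect_right(keys, endkey)
    return rows[lo:hi]


# ":doctype" arrives from the URL as text; the "int" format corresponds to
# converting it to a number before it is placed in the key.
doctype = int("22")
result = query_range(rows, (doctype, 0), (doctype, 9999))
print([doc_id for _, doc_id in result])  # ['doc-b', 'doc-c', 'doc-d']
```

If the type stayed a string, the key would not match the numeric type emitted by the view, which is the failure described in the question.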

Related

Using a map element in MATCH statement

I have a query written out where one of the lines is as follows:
[individualNode IN listOfNodes | [(individualNode)-[:CONNECTED_WITH]->(otherNode) | {node:otherNode, similarity:individualNode['similarity']}]] AS connectionMap
listOfNodes is a list of maps.
An example of one of the maps in the list:
{
"similarity":0.25,
"node":{
"identity":12345,
"labels": [
"Label1",
"Label2"
],
"properties": {
yada..yada..
}
}
}
The issue here is that, since individualNode is a map, the pattern (individualNode)-[:CONNECTED_WITH]->(otherNode) fails.
So my question is: how do I access the node for use in the pattern while still retaining the map, so I can grab the similarity value?
Disclaimer: I know node is a special word in Cypher; I only used it here so it is clear what I am referring to in the map. That's not how it is named in my actual query.
I also changed the names of things because I cannot reveal the actual information in the map.
I have tried writing it as (individualNode.node)-[:CONNECTED_WITH]->(otherNode) or (individualNode['node'])-[:CONNECTED_WITH]->(otherNode), but both throw errors too.
Do you want to use it later?
You can access map entries by key, e.g. map.node.
If you want to use that in a pattern, you have to alias it with an identifier,
e.g. WITH map.node as startNode MATCH (startNode)-->(...)
If you have a list of nodes, as in your case, you can either walk through it in a pattern comprehension again, as you already did,
or you can use UNWIND to turn the list into rows:
UNWIND listOfNodesMaps as map
WITH map.node as startNode, map.similarity as similarity
MATCH (startNode)-->(...)

perl6: access values in a multidimensional variable

The Perl6 Twitter module returns a multidimensional variable with the tweets from a search query. This code:
%tweets<statuses>[0]<metadata><iso_language_code>.say;
%tweets<statuses>[0]<created_at>.say;
prints:
es
Fri May 04 13:54:47 +0000 2018
The following code prints the 'created_at' value of the tweets from the search query.
for @(%tweets<statuses>) -> $tweet {
$tweet<created_at>.say;
}
Is there a better syntax to access the values of the variable %tweets?
Thanks!
If the question is whether there is a shorter syntax for hash indexing with literal keys than <...>, then no, that's as short as it gets. In Perl 6, there's no conflation of the hash data structure with object methods/attributes/properties (unlike with JavaScript, for example, where there is no such distinction, so . is used for both).
There are plenty of ways to get rid of repetition and boilerplate, however. For example:
%tweets<statuses>[0]<metadata><iso_language_code>.say;
%tweets<statuses>[0]<created_at>.say;
Could be written instead as:
given %tweets<statuses>[0] {
.<metadata><iso_language_code>.say;
.<created_at>.say;
}
This uses the topic variable $_. It can also be used in short, simple loops, like this:
for @(%tweets<statuses>) {
.<created_at>.say;
}

An XDMP-NOTANODE error using xquery in marklogic

I'm getting the XDMP-NOTANODE error when I try to run an XQuery in MarkLogic. When I loaded my XML documents, I loaded metadata files with them. I'm a student and I don't have experience in XQuery.
error:
[1.0-ml] XDMP-NOTANODE: (err:XPTY0019) $article/article/front/article-meta/title-group/article-title -- xs:untypedAtomic("
") is not a node
Stack Trace
At line 3 column 77:
In xdmp:eval("(for $article in fn:distinct-values(/article/text()) &#1...", (), <options xmlns="xdmp:eval"><database>4206169969988859108</database> <root>C:\mls-projects\pu...</options>)
$article := xs:untypedAtomic("
")
1. (for $article in fn:distinct-values(/article/text())
2.
3. return (fn:distinct-values($article/article/front/article-meta/title-group/article-title)
4.
5.
Code:
(
for $article in fn:distinct-values(/article/text())
return (
fn:distinct-values($article/article/front/article-meta/title-group/article-title/text())
)
)
Every $article is bound to an atomic value (fn:distinct-values() returns a sequence of atomic values). You then try to apply a path expression (using the / operator) to $article. That is forbidden, as the path operator requires its left-hand operand to be a sequence of nodes.
I am afraid your code does not make enough sense for me to suggest an actual solution. I can only pinpoint where the error is.
Furthermore, using text() at the end of a path is most of the time a bad idea. And if /article is a complex document, it is certainly not what you want. One of the text nodes you select (most likely the first one) is simply one single newline character.
What do you want to achieve?
Your $article variable is bound to an atomic value, not a node() from the article document. You can only use an XPath axis on a node.
When you apply the function distinct-values() in the for statement, it returns simple string values, not the article document or nodes from it.
You can probably make things work by using the values in a predicate filter like this:
for $article-text in fn:distinct-values(/article/text())
return
fn:distinct-values(/article[text()=$article-text]/front/article-meta/title-group/article-title/text())
Note: The above XQuery should avoid the XDMP-NOTANODE error, but there are likely easier (and more efficient) solutions for achieving your goal. If you were to post a sample of your document and describe what you are trying to achieve, we could suggest alternatives.
Bit of a wild guess, but you have two distinct-values calls in your code. That makes me think you want a unique list of articles, and then finally a unique list of article titles. I would hope you already have unique articles in your database, unless you are explicitly attempting to de-duplicate them.
In case you just want the overall unique list of article titles, I would do something like:
distinct-values(
for $article in collection()/article
return
$article/front/article-meta/title-group/article-title
)
HTH!

Marklogic collate sequence in XQuery

Is there a way to modify the elements of a sequence so that only collated versions of the items are returned?
let $currencies := ('dollar', 'Dollar', 'dollar ')
return fn:collated-only($currencies, "http://marklogic.com/collation/en/S1/T00BB/AS")
=> ('dollar', 'dollar', 'dollar')
The values that are stored in the range index (that feeds the facets) are literally the first value that was encountered that compared equal to the others. (Because, the collation says you don't care...)
You can get a long way by calling
fn:replace(fn:lower-case(xdmp:diacritic-less(fn:normalize-unicode($str,"NFKC"))),"\p{P}","")
This won't be exactly the same in that it overfolds some things and underfolds others, but it may be good for your purposes.
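For readers who want to see each folding step spelled out, here is a rough Python analogue of that expression. It is only a sketch: the function name fold is made up, and Python's unicodedata module is standing in for the MarkLogic functions, so edge cases may differ from what xdmp:diacritic-less and the collation actually do.

```python
import unicodedata


def fold(text):
    """Approximate a case-, diacritic- and punctuation-insensitive form."""
    # Step 1: compatibility normalization, like fn:normalize-unicode($str, "NFKC").
    text = unicodedata.normalize("NFKC", text)
    # Step 2: drop diacritics by decomposing and removing combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Step 3: fold case.
    text = text.lower()
    # Step 4: remove punctuation (Unicode general categories starting with "P").
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


currencies = ["dollar", "Dollar", "dollar "]
print([fold(c) for c in currencies])  # ['dollar', 'dollar', 'dollar ']
```

Note that the trailing space in the third value survives, because whitespace is not punctuation; trimming would have to be added as a separate step if it matters.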
Is this the expected output? There is no fn:collated-only function, so I'm assuming you're asking how to write such a function or whether there is such a function.
The thing is, there isn't a mapping from one string to another in collation comparisons, there is only a comparison algorithm (the Unicode Collation Algorithm) so there really is no canonical kind of string to return to you, and therefore no API to do so.
Stepping back, what is the problem you are actually trying to solve? By the rules of that collation, "dollar" and "Dollar" are equivalent, and by using it you declare you don't care which form you use, so you could use either one.
If these values are in XML elements and you have a range index using http://marklogic.com/collation/en/S1/T00BB/AS, you can do something like this:
let $ref := cts:element-reference(xs:QName("currency"), "collation=http://marklogic.com/collation/en/S1/T00BB/AS")
for $curr in cts:values($ref, (), "frequency-order")
return $curr || ": " || cts:frequency($curr)
This will produce results like:
"dollar: 15",
"euro: 12"
... and so on. The collation will disregard the differences among your sample inputs. These results could be formatted however you want. Is that what you're looking to do?

How can I prevent SQLite from treating a string as a number?

I would like to query an SQLite table that contains directory paths to find all the paths under some hierarchy. Here's an example of the contents of the column:
/alpha/papa/
/alpha/papa/tango/
/alpha/quebec/
/bravo/papa/
/bravo/papa/uniform/
/charlie/quebec/tango/
If I search for everything under /bravo/papa/, I would like to get:
/bravo/papa/
/bravo/papa/uniform/
I am currently trying to do this like so (see below for the long story of why I can't use more simple methods):
SELECT * FROM Files WHERE Path >= '/bravo/papa/' AND Path < '/bravo/papa0';
This works. It looks a bit weird, but it works for this example. '0' is the Unicode code point 1 greater than '/'. When ordered lexicographically, all the paths starting with '/bravo/papa/' compare greater than or equal to it and less than '/bravo/papa0'. However, in my tests, I find that this breaks down when we try this:
SELECT * FROM Files WHERE Path >= '/' AND Path < '0';
This returns no results, but it should return every row. As far as I can tell, the problem is that SQLite is treating '0' as a number, not a string. If I use '0Z' instead of '0', for example, I do get results, but I introduce a risk of getting false positives. (For example, if there actually was an entry '0'.)
The simple version of my question is: is there some way to get SQLite to treat '0' in such a query as the length-1 string containing the Unicode character '0' (which should sort after strings such as '!', '*' and '/', but before '1', '=' and 'A') instead of the integer 0 (which SQLite sorts before all strings)?
I think in this case I can actually get away with special-casing a search for everything under '/', since all my entries will always start with '/', but I'd really like to know how to avoid this sort of thing in general, as it's unpleasantly surprising in all the same ways as JavaScript's "==" operator.
First approach
A more natural approach would be to use the LIKE or GLOB operator. For example:
SELECT * FROM Files WHERE Path LIKE @prefix || '%';
But I want to support all valid path characters, so I would need to use ESCAPE for the '_' and '%' symbols. Apparently this prevents SQLite from using an index on Path. (See http://www.sqlite.org/optoverview.html#like_opt ) I really want to be able to benefit from an index here, and it sounds like that's impossible using either LIKE or GLOB unless I can guarantee that none of their special characters will occur in the directory name, and POSIX allows anything other than NUL and '/', even GLOB's '*' and '?' characters.
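For reference, the escaping this approach would need looks roughly like the Python sketch below. The helper name like_prefix and the sample paths are made up for illustration, and backslash is an arbitrary choice of escape character. It only demonstrates that the escaped pattern matches literally; it says nothing about whether an index is used.

```python
import sqlite3


def like_prefix(prefix, escape="\\"):
    """Escape LIKE wildcards in prefix and append a trailing % wildcard."""
    # The escape character itself must be escaped first.
    for ch in (escape, "%", "_"):
        prefix = prefix.replace(ch, escape + ch)
    return prefix + "%"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Files (Path TEXT)")
conn.executemany(
    "INSERT INTO Files (Path) VALUES (?)",
    [("/bravo/100%_done/",), ("/bravo/100%_done/x/",), ("/bravo/100xxdone/",)],
)

# Without escaping, % and _ in the prefix would also match '/bravo/100xxdone/'.
rows = conn.execute(
    "SELECT Path FROM Files WHERE Path LIKE ? ESCAPE '\\' ORDER BY Path",
    (like_prefix("/bravo/100%_done/"),),
).fetchall()
print(rows)  # [('/bravo/100%_done/',), ('/bravo/100%_done/x/',)]
```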
I'm providing this for context. I'm interested in other approaches to solve the underlying problem, but I'd prefer to accept an answer that directly addresses the ambiguity of strings-that-look-like-numbers in SQLite.
Similar questions
How do I prevent sqlite from evaluating a string as a math expression?
In that question, the values weren't quoted. I get these results even when the values are quoted or passed in as parameters.
EDIT - See my answer below. The column was created with the invalid type "STRING", which SQLite treated as NUMERIC.
* Groan *. The column had NUMERIC affinity because it had accidentally been specified as "STRING" instead of "TEXT". Since SQLite didn't recognize the type name, it made it NUMERIC, and because SQLite doesn't enforce column types, everything else worked as expected, except that any time a number-like string is inserted into that column it is converted into a numeric type.
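The effect is easy to reproduce with Python's built-in sqlite3 module. In this sketch the table names Bad and Good and the sample rows are made up; the only difference between the two tables is the declared column type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "STRING" is not a type name SQLite recognizes, so the column gets NUMERIC
# affinity; "TEXT" gives TEXT affinity.
conn.execute("CREATE TABLE Bad (Path STRING)")
conn.execute("CREATE TABLE Good (Path TEXT)")

paths = ["/alpha/papa/", "/bravo/papa/", "/bravo/papa/uniform/", "0"]
for table in ("Bad", "Good"):
    conn.executemany(
        f"INSERT INTO {table} (Path) VALUES (?)", [(p,) for p in paths]
    )


def stored_types(table):
    """Storage class of each stored value, in insertion order."""
    sql = f"SELECT typeof(Path) FROM {table} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]


def under(table, low, high):
    """Paths with low <= Path < high, compared by the column's rules."""
    sql = f"SELECT Path FROM {table} WHERE Path >= ? AND Path < ? ORDER BY Path"
    return [row[0] for row in conn.execute(sql, (low, high))]


print(stored_types("Bad"))    # ['text', 'text', 'text', 'integer']
print(stored_types("Good"))   # ['text', 'text', 'text', 'text']
print(under("Bad", "/", "0"))   # []
print(under("Good", "/", "0"))  # the three paths starting with '/'
```

With the STRING column, the number-like '0' is stored as an integer and the '0' bound in the query is also converted to a number, so the range matches nothing. With TEXT, both stay strings and the range behaves as intended.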
