perl6: access values in a multidimensional variable - multidimensional-array

Perl6 Twitter module gives a multidimensional variable with the tweets from a search query. This code:
%tweets<statuses>[0]<metadata><iso_language_code>.say;
%tweets<statuses>[0]<created_at>.say;
prints:
es
Fri May 04 13:54:47 +0000 2018
The following code prints the 'created_at' value of the tweets from the search query.
for #(%tweets<statuses>) -> $tweet {
$tweet<created_at>.say;
}
Is there a better syntax to access the values of the variable %tweets?
Thanks!

If the question is whether there is a shorter syntax for hash indexing with literal keys than <...>, then no, that's as short as it gets. In Perl 6, there's no conflation of the hash data structure with object methods/attributes/properties (unlike with JavaScript, for example, where there is no such distinction, so . is used for both).
There are plenty of ways to get rid of repetition and boilerplate, however. For example:
%tweets<statuses>[0]<metadata><iso_language_code>.say;
%tweets<statuses>[0]<created_at>.say;
Could be written instead as:
given %tweets<statuses>[0] {
.<metadata><iso_language_code>.say;
.<created_at>.say;
}
This is using the topic variable $_. For short, simple, loops, that can also be used, like this:
for #(%tweets<statuses>) {
.<created_at>.say;
}

Related

Why fn:substring-after Xquery function could not be used inside ML TDE

In my ML db, we have documents with distributor code like 'DIST:5012' (DIST:XXXX) XXXX is a four-digit number.
currently, in my TDE, the below code works well.
However instead of concat all the raw distributor codes, I want to simply concat the number part only. I used the fn:substring-after XQuery function. However, it won't work. It won't show that distributorCode column in the SQL View anymore. (Below code does not work.)
What is wrong? How to fix that?
Both fn:substring-after and fn:string-join is in TDE Dialect page.
https://docs.marklogic.com/9.0/guide/app-dev/TDE#id_99178
substring-after() expects a single string as input, not a sequence of strings.
To demonstrate, this will not work:
let $dist := ("DIST:5012", "DIST:5013")
return substring-after($dist, "DIST:")
This will:
for $dist in ("DIST:5012", "DIST:5013")
return substring-after($dist, "DIST:")
I need to double check what XPath expressions will work in a DTE, you might be able to change it to apply the substring-after() function in the last step:
fn:string-join( distributors/distributor/urn/substring-after(., 'DIST:'), ';')

An XDMP-NOTANODE error using xquery in marklogic

I'm getting the XDMP-NOTANODE error when I try to run an XQuery in MarkLogic. When I loaded my xml documents I loaded meta data files with them. I'm a student and I don't have experience in XQuery.
error:
[1.0-ml] XDMP-NOTANODE: (err:XPTY0019) $article/article/front/article-meta/title-group/article-title -- xs:untypedAtomic("
") is not a node
Stack Trace
At line 3 column 77:
In xdmp:eval("(for $article in fn:distinct-values(/article/text()) &#1...", (), <options xmlns="xdmp:eval"><database>4206169969988859108</database> <root>C:\mls-projects\pu...</options>)
$article := xs:untypedAtomic("
")
1. (for $article in fn:distinct-values(/article/text())
2.
3. return (fn:distinct-values($article/article/front/article-meta/title-group/article-title)
4.
5.
Code:
(
for $article in fn:distinct-values(/article/text())
return (
fn:distinct-values($article/article/front/article-meta/title-group/article-title/text())
)
)
Every $article is bound to an atomic value (fn:distinct-values() returns a sequence of atomic values). Then you try to apply a path expression (using the / operator) on $article. Which is forbidden, as the path operator requires its LHS operator to be nodes.
I am afraid your code does not make sense enough for me to suggest you an actual solution. I can only pinpoint where the error is.
Furthermore, using text() at the end of a path is most of the time a bad idea. And if /article is a complex document, it is certainly not what you want. One of the text nodes you select (most likely the first one) is simply one single newline character.
What do you want to achieve?
Your $article variable is bound to an atomic value, not a node() from the article document. You can only use an XPath axis on a node.
When you apply the function distinct-values() in the for statement, it returns simple string values, not the article document or nodes from it.
You can probably make things work by using the values in a predicate filter like this:
for $article-text in fn:distinct-values(/article/text())
return
fn:distinct-values(/article[text()=$article-text]/front/article-meta/title-group/article-title/text())
Note: The above XQuery should avoid the XDMP-NOTANODE error, but there are likely easier (and more efficient) solutions for achieving your goal. If you were to post a sample of your document and describe what you are trying to achieve, we could suggest alternatives.
Bit of a wild guess, but you have two distinct-values in your code. That makes me think you want a unique list of articles, and then finally a unique list of article-title's. I would hope you already have unique articles in your database, unless you are explicitly attempting to de-duplicate them.
In case you just want the overall unique list of article titles, I would do something like:
distinct-values(
for $article in collection()/article
return
$article/front/article-meta/title-group/article-title
)
HTH!

Marklogic collate sequence in XQuery

Is there a way to modify the elements a sequence so only collated versions of the items are returned?
let $currencies := ('dollar', 'Dollar', 'dollar ')
return fn:collated-only($currencies, "http://marklogic.com/collation/en/S1/T00BB/AS")
=> ('dollar', 'dollar', 'dollar')
The values that are stored in the range index (that feeds the facets) are literally the first value that was encountered that compared equal to the others. (Because, the collation says you don't care...)
You can get a long way by calling
fn:replace(fn:lower-case(xdmp:diacritic-less(fn:normalize-unicode($str,"NFKC"))),"\p{P}","")
This won't be exactly the same in that it overfolds some things and underfolds others, but it may be good for your purposes.
Is this the expected output? There is no fn:collated-only function, so I'm assuming you're asking how to write such a function or whether there is such a function.
The thing is, there isn't a mapping from one string to another in collation comparisons, there is only a comparison algorithm (the Unicode Collation Algorithm) so there really is no canonical kind of string to return to you, and therefore no API to do so.
Stepping back, what is the problem you are actually trying to solve? By the rules of that collation, "dollar" and "Dollar" are equivalent, and by using it you declare you don't care which form you use, so you could use either one.
If these values are in XML elements and you have a range index using http://marklogic.com/collation/en/S1/T00BB/AS, you can do something like this:
let $ref := cts:element-reference(xs:QName("currency"), "collation=http://marklogic.com/collation/en/S1/T00BB/AS")
for $curr in cts:values($ref, (), "frequency-order")
return $curr || ": " || cts:frequency($curr)
This will produce results like:
"dollar: 15",
"euro: 12"
... and so on. The collation will disregard the differences among your sample inputs. These results could be formatted however you want. Is that what you're looking to do?

R data table issue

I'm having trouble working with a data table in R. This is probably something really simple but I can't find the solution anywhere.
Here is what I have:
Let's say t is the data table
colNames <- names(t)
for (col in colNames) {
print (t$col)
}
When I do this, it prints NULL. However, if I do it manually, it works fine -- say a column name is "sample". If I type t$"sample" into the R prompt, it works fine. What am I doing wrong here?
You need t[[col]]; t$col does an odd form of evaluation.
edit: incorporating #joran's explanation:
t$col tries to find an element literally named 'col' in list t, not what you happen to have stored as a value in a variable named col.
$ is convenient for interactive use, because it is shorter and one can skip quotation marks (i.e. t$foo vs. t[["foo"]]. It also does partial matching, which is very convenient but can under unusual circumstances be dangerous or confusing: i.e. if a list contains an element foolicious, then t$foo will retrieve it. For this reason it is not generally recommended for programming.
[[ can take either a literal string ("foo") or a string stored in a variable (col), and does not do partial matching. It is generally recommended for programming (although there's no harm in using it interactively).

rewrite rules that converts tokens to integer parameters

After much wrestling with the idea of ranking records, I finally settled on numeric based scores for my documents, which I emit to have them sorted based on these scores.
Now these numbers have meaning, where the 1st 2 digits represent a specific type of document.
Therefore, to get documents of type 22 sorted based on their scores, I simply query the view with start key being 220000 and end key being 229999
This is all great and works, my problems occur when I try to use url rewrites.
I'm basically trying to reroute:
/_rewrite/rankings/{doctype}
to
/_list/rankings?startkey=xx0000&endkeyxx9999
where xx is the {doctype}
my issue is with specifying rewrite rule:
[
{ "from":"rankings/:doctype",
"to":"_list/rankings",
"query": ??? //what will this be?
]
How can I construct the start and end keys by appending 0000 and 9999 respectively?
how can I specify a numeric value? since using place holder ":doctype" will result in a string type rather than a numberic type, resulting in a failed query even if I were to modify my pretty url to input both start and end keys.
I worked around the issue by filtering the results in my list view (ignoring docs im not interested in from getRow()), my concern here, should I worry about efficiency of list function now?
feel free to comment also on my sorting strategy .. would be interested to know how others solved their sorting and slicing problems with couchdb
Solution
First, you should emit the type and the score separately in an array instead of concatenating them:
emit([doc.type, doc.score], doc);
Then you can rewrite like this
[
{
"from" : "rankings/:doctype",
"to" : "_list/rankings/rankings",
"query" : {
"startkey" : [":doctype", 0],
"endkey" : [":doctype", 9999]
},
"formats": {
"doctype" : "int"
}
}
]
I tested it on CouchDB 1.1.1 and it works.
Reference
The relevant documentation is buried in this issue on JIRA: COUCHDB-1074
As you can see, the issue was resolved on April 2011, so it should work in CouchDB 1.0.3 and above.

Resources