Efficiently accessing the first item in an object - jq

As input, consider a database dump (from DBeaver) in this format:
{
"select": [
{<row1>},
{<row2>}
],
"select": {}
}
Say I'm debugging a bigger script and just want to see the first few rows from the first statement. How can I do that efficiently in a rather huge file?
Template:
jq 'keys[0] as $k|.[$k]|limit(1;.[])' dump
isn't really great, as it needs to fetch all keys first. The template
jq '.[0]|limit(1;.[])' dump
sadly does not seem to be valid, and
jq 'first(.[])|limit(1;.[])' dump
does not seem to have any performance benefit.
What would be the best way to access just the first field in an object without actually testing its name or caring about the rest of the fields?

One strategy would be to use the --stream command-line option. It’s a bit tricky to use, but if you want to use jq or gojq, it’s the way to go for a space-time efficient solution for a large input.
Far easier to use would be my jm script, which is intended precisely to achieve the kind of objective you describe. In particular, please note its --limit option. E.g. you could start with:
jm -s --limit 1
See
https://github.com/pkoppstein/jm
How to read a 100+GB file with jq without running out of memory
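To get a feel for what --stream produces, here is a minimal sketch. The two-row sample is a hypothetical stand-in shaped like the dump in the question (including the duplicate "select" key); it is not taken from a real export. Each output line is one event: a [path, leaf] pair for a value, or a lone [path] marking the end of an array or object.

```shell
# Hypothetical two-row sample shaped like the dump in the question.
sample='{"select":[{"id":1},{"id":2}],"select":{}}'

# Emit the streamed form, one compact event per line.
events=$(printf '%s\n' "$sample" | jq --stream -c '.')
printf '%s\n' "$events"
# [["select",0,"id"],1]
# [["select",0,"id"]]
# [["select",1,"id"],2]
# [["select",1,"id"]]
# [["select",1]]
# [["select"],{}]
# [["select"]]
```

Because events are emitted as the parser encounters them, a filter can stop after the first few without building the whole object in memory.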

Given that weird object with identical keys, you can use the --stream option to access all items before the JSON processor would eliminate the duplicates, fromstream and truncate_stream to dissect the input, and limit to reduce the output to just a few items:
jq --stream -cn 'limit(5; fromstream(2|truncate_stream(inputs)))' dump.json
{<row1>}
{<row2>}
{<row3>}
{<row4>}
{<row5>}
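The same filter can be tried on a tiny file before pointing it at the real dump. This is only a sketch: the three-row sample and the temporary file are made up for illustration, and it assumes jq 1.6 or newer, since (as far as I recall) truncate_stream with a depth other than 1 misbehaved in 1.5.

```shell
# Hypothetical miniature of the dump: two "select" keys, as in the question.
dump=$(mktemp)
printf '%s\n' '{"select":[{"id":1},{"id":2},{"id":3}],"select":{}}' > "$dump"

# Same filter as above, limited to 2 rows.
# "2|truncate_stream(...)" strips the key and the array index from each path,
# so fromstream rebuilds one row object at a time.
first_rows=$(jq --stream -cn 'limit(2; fromstream(2|truncate_stream(inputs)))' "$dump")
printf '%s\n' "$first_rows"
# {"id":1}
# {"id":2}

rm -f "$dump"
```

The third row and the second "select" are never emitted, because limit stops the filter once two rows have been produced.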

Related

How to change the definition of a command to refer to a different action

I want to create a new action called washing, as follows:
Understand "wash [something] with [something]" as washing.
Understand the command "clean" as "wash".
However, the Inform7 standard rules define a number of synonyms for rub, one of which is clean:
Understand "rub [something]" as rubbing.
Understand the commands "shine", "polish", "sweep", "clean", "dust", "wipe" and "scrub" as "rub".
The result is that I get a compiler error:
Problem. You wrote 'Understand the command "clean" as "wash"': but 'understand the command ... as ...' is only allowed when the new command has no meaning already, so for instance 'understand "drop" as "throw"' is not allowed because "drop" already has a meaning.
How can I tell Inform to switch the meaning of the clean command from rub to wash without affecting the rest of the rub definition at all?
Per section 17.3 of the manual, you can use "as something new":
Understand the command "clean" as something new.
This will remove the word from the dictionary, but leave the old synonyms in place, i.e., it won't affect the rubbing action. And afterwards you can then define it as a synonym for your new washing action.
However, consider that many of the rubbing verbs could apply to washing. Instead of reassociating the "clean" verb, you could just redirect the rubbing action to the washing action for certain objects (if washing always takes a second noun, as in the grammar line above, name it in the try phrase as well):
Instead of rubbing the dishes:
	try washing the dishes.
Then you'd get all the verbs. Some of them might not make full sense, but they're unlikely to be used by the player, and the parser accepting extra commands that don't quite make sense is nowhere near as big a problem as the parser rejecting commands that do make sense.

Xquery optimization

I have this xquery as follows:
declare variable $i := doc()/some-element/modifier[empty(modifier-value)];
$i[1]/../..;
I need to run this query in MarkLogic's QConsole, where we have 721170811 records. Since that is a huge number of records, I am getting a timeout error. Is there any way I can optimize this query to get the result?
P.S. I cannot ask the admin to increase the timeout.
Try creating an element range index (or a path range index if the target element is not unique) and using a cts:values() lexicon lookup.
That way, the request can read the values from the range index instead of having to read each document.
See:
http://docs.marklogic.com/guide/search-dev/lexicon
You could use xdmp:spawn. Create a library module that runs the query, fetches the documents, and iterates over the result, collecting 1000 documents per iteration, then calls another xdmp:spawn to process the information from that dataset. I would suggest summarizing the result so that you return only the information you need and don't crash the browser. In the end it should look something like this:
xdmp:spawn("process.xqy")
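and, in the library module process.xqy:
declare function local:start-process() {
  let $docs := (....)
  (: $start and $end are the bounds of the current batch, e.g. 1 and 1000 :)
  let $temp :=
    for $x in $docs[$start to $end]
    return local:process-dataset($x) (: Could use spawn here too if you want :)
  (: external variables are passed as QName/value pairs; the name "temp" is illustrative :)
  return xdmp:spawn("collect.xqy", (xs:QName("temp"), $temp))
};
local:start-process()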
The compact-data function should create a file or a set of files with your data. This way the server runs the whole process, and in a few minutes you will be able to see your data without problems.
You don't want to run something like doc() or xdmp:directory: each just returns a huge result set that will kill you every time. You need to shrink your result set by a lot.
A few thoughts:
You want to have as much as possible done in MarkLogic's d-node, and as little as possible done in the e-node. This is a big over-generalization, but for the most part I look at it like this: d-node stuff is data, indexes, lexicon work, etc.; e-node stuff handles XQuery and such. So, in your example, you're definitely working the e-node more than you need to.
You're going to want to use cts:search, as it uses indexes, not XPath, to resolve your query. So, something like this:
declare variable $i := cts:search(fn:collection(),
cts:element-query(xs:QName("some-element"),
cts:element-value-query(xs:QName("modifier"), "", "exact")
)
)[1];
This will return document nodes, which looks like what you wanted from $i[1]/../.. in the original query. It searches within some-element for a modifier that is empty.
Please create an element range index and an attribute range index and use cts:search. If you are familiar with MarkLogic, the query will be easy to write.

How to dynamically search/replace text with update in XQuery (exist-db)

My intention is to somehow clean source files automatically. How to do that in XQuery? (I am not interested in reconstructing the document in memory and storing it as a new one.) It is quite easy to do something similar in case of short and simple elements addressed directly, however, I can’t figure out how to do that dynamically for all the text nodes, if possible.
I would expect something like this could work:
update replace $div[contains(., 'chapter')] with replace(., 'chapter', 'Chapter')
This throws err:XPDY0002 Undefined context sequence for 'self::node()' [source: String]
Apparently, there is a problem in addressing the context with . in the replacing function. But maybe I don’t understand the update thing in general. I am only inspired by the bottom of this article.
The expression to the right of with is evaluated independently of the expression to the left, so an explicit node/context is needed on both sides:
update replace $div[contains(., 'chapter')] with replace($div, 'chapter', 'Chapter')

avoiding XDMP-EXPNTREECACHEFULL and loading document

I am using MarkLogic 4 and I have some 15000 documents (each around 10 KB). I want to load the entire content as a document (and convert all the documents to a single CSV file, written to the HTTP output stream for downloading). I load the documents this way:
let $uri := cts:uri-match('products/documents/*.xml')
let $doc := fn:doc ($uri)
The URI pattern matches some 15000 XML files, so fn:doc throws an XDMP-EXPNTREECACHEFULL error.
Is there any workaround for this? I cannot increase tree cache size in admin console because the number of xml files in products/documents/*.xml may increase.
Thanks.
When you want to export large quantities of XML from MarkLogic, the best technique is to write the query so that results can stream, avoiding the expanded tree cache entirely. It is a very different style of coding, though: you'll have to avoid strong typing of any kind, and refactor your code to remove FLWOR expressions. You won't be able to test any of the code in cq or qconsole, either.
Take a look at http://blakeley.com/blogofile/2012/03/19/let-free-style-and-streaming/ for some tips on how to get there. At a minimum the code sample you posted would have to become:
doc(cts:uri-match('products/documents/*.xml'))
In passing I would try to rework that to avoid the *.xml part, because it will be slower than needed. Maybe something like this?
cts:search(
collection(),
cts:directory-query('products/documents/', 'infinity'))
If you need to test for something more than the directory, you could add a cts:and-query with some cts:element-query test.
For general information about this error, see the MarkLogic knowledge base article on XDMP-EXPNTREECACHEFULL.

Is it possible to declare a default associative array in Smarty?

In Smarty, I know you can declare a string:
{$somevar|default:'some string'}
or even an array:
{$somevar|default:array('someval')}
Is it possible to set an associative array as a default value, and if so, how? This doesn't seem to work:
{$somevar|default:array('default'=>array('subkey'=>'subval'))}
I just tried:
{$somevar|default:array('key'=>'val')}
It's the '=>' that Smarty doesn't like.
I know it's probably not the solution you're looking for, but you can always just use the {php} feature. However, I will try a few things and see if I can work the format out.
Just out of interest, why are you trying to do this in the tpl file and not in the calling PHP script?
Edit
From reading around, it doesn't seem to be possible. However, there is a "set" plugin which allows it, see here (bottom example).
