Partial query failure for KQL query size - azure-data-explorer

I'm getting this goofy error, which looks to be related to query size. My working query takes around 3 minutes. If I add another few lines (which should take ~3 seconds given the results from the previous query), Kusto throws a partial query failure due to query size.
I didn't realize there was such a limit (see https://learn.microsoft.com/en-us/azure/data-explorer/kusto/concepts/resulttruncation). I've removed comments and tweaked some queries to emit fewer fields, but I'm still running into the same error.
Any idea about what's going on here?
Here's the error trace

This error can occur in a cross-cluster query scenario. In this scenario Kusto (Azure Data Explorer) generates a query and sends it to the other cluster, and that generated query can become very long if it contains the in() operator and the arguments of in() come from partial evaluation of the query.
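As a hedged illustration of the pattern that can trigger this (table, column, and cluster names here are hypothetical, not from the question): the in() arguments are evaluated locally first, and the resulting values are expanded into the query text sent to the remote cluster.

// Hypothetical sketch: Ids is evaluated locally; its values are expanded
// into the query that Kusto sends to cluster('remotecluster'), so a large
// value list makes that generated query very long.
let Ids = LocalTable | distinct Id;
cluster('remotecluster').database('SomeDb').RemoteTable
| where Id in (Ids)
| summarize count()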
If you provide more details about your scenario and the query you're running, you can get a more detailed answer.

Related

Kusto query cancelled due to appended alias statements

I wrote a Kusto query which runs in under 1 minute; however, the same query gets cancelled for a colleague due to a 10x longer runtime. I collected some metadata about both query runs using .show queries and found that he had the same ClientRequestProperties and even fewer cache misses, but there were 400+ alias database statements appended to the start of his query text. Is there some connection or user setting that could be causing these alias statements to be appended?
The alias statements themselves (400+) should not have a negative impact. However, if these statements point to other clusters, the query may turn into one that is sent to many clusters and is no longer the same as the first query.
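For context, an alias database statement is a prefix in the query text along these lines (cluster, database, and table names here are hypothetical); if the aliased cluster is not the one being queried, any reference to the alias makes the query cross-cluster:

// Hypothetical sketch of one of the 400+ prepended statements.
alias database db1 = cluster('https://othercluster.kusto.windows.net').database('SomeDb');
database('db1').MyTable
| take 10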
The next step should be to ensure that the semantics of the two queries you are comparing are identical. If they are, please open a support ticket so that the team can investigate.
If you need help evaluating the semantics of the queries, please edit your question and provide a sample, simplified, and anonymized query that will allow better analysis of the issue.

Azure Cosmos DB aggregation and indexes

I'm trying to use Cosmos DB and I'm having some trouble making a simple count in a collection.
My collection schema is below, and I have 80,000 documents in this collection.
{
"_id" : ObjectId("5aca8ea670ed86102488d39d"),
"UserID" : "5ac161d742092040783a4ee1",
"ReferenceID" : 87396,
"ReferenceDate" : ISODate("2018-04-08T21:50:30.167Z"),
"ElapsedTime" : 1694,
"CreatedDate" : ISODate("2018-04-08T21:50:30.168Z")
}
If I run the command below to count all documents in the collection, I get the result very quickly:
db.Tests.count()
But when I run the same command for a specific user, I get the message "Request rate is large".
db.Tests.find({UserID:"5ac161d742092040783a4ee1"}).count()
In the Cosmos DB documentation I found this scenario, and the suggestion is to increase RUs. I currently have 400 RU/s; when I increase it to 10,000 RU/s I can run the command with no errors, but it takes 5 seconds.
I already tried to create an index explicitly, but it seems Cosmos DB doesn't use the index for the count.
I do not think it is reasonable to have to pay for 10,000 RU/s just to run a simple count on a collection with approximately 100,000 documents, especially when it still takes about 5 seconds.
Count by filter queries ARE using indexes if they are available.
If you try a count by filter on a non-indexed column, the query will not time out but fail. Try it. You should get an error along the lines of:
{"Errors":["An invalid query has been specified with filters against path(s) excluded from indexing. Consider adding allow scan header in the request."]}
So definitely add a suitable index on UserID.
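Since the question's commands suggest the MongoDB API, a minimal sketch of creating that index could look like this (verify the exact form against your account's API version):

// Single-field index on UserID, so the count-by-filter can be served from the index.
db.Tests.createIndex({ UserID: 1 })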
If you don't have index coverage and don't get the above error, then you have probably set the enableScanInQuery flag. This is almost always a bad idea, and a full scan does not scale: it consumes increasingly large amounts of RU as your dataset grows. So make sure it is off, and index instead.
When you DO have an index on the selected column, your query should run. You can verify that the index is actually being used by sending the x-ms-documentdb-populatequerymetrics header, which should return confirmation with the indexLookupTimeInMs and indexUtilizationRatio fields. Example output:
"totalExecutionTimeInMs=8.44;queryCompileTimeInMs=8.01;queryLogicalPlanBuildTimeInMs=0.04;queryPhysicalPlanBuildTimeInMs=0.06;queryOptimizationTimeInMs=0.00;VMExecutionTimeInMs=0.14;indexLookupTimeInMs=0.11;documentLoadTimeInMs=0.00;systemFunctionExecuteTimeInMs=0.00;userFunctionExecuteTimeInMs=0.00;retrievedDocumentCount=0;retrievedDocumentSize=0;outputDocumentCount=1;outputDocumentSize=0;writeOutputTimeInMs=0.01;indexUtilizationRatio=0.00"
It also gives you some insight into where the effort has gone, should you feel the RU charge is too large.
If the index lookup time itself is too high, consider whether your index is selective enough and whether the index settings are suitable. Look at your UserID values and their distribution, and adjust the index accordingly.
Another wild guess to consider: check whether the API you are using defers executing find(..) until it knows that count() is really what you are after. It is unclear which API you are using. If it turns out it fetches all matching documents to the client side before doing the counting, that would explain the unexpectedly high RU cost, especially if a large number of matching documents, or large documents, are involved. Check the API documentation.
I also suggest executing the same query directly in the Azure Portal to compare the RU cost and verify whether the issue is client-related.
I think it just doesn't work.
The index seems to be used when selecting the documents to be counted, but the count is then done by reading each document, which effectively consumes a lot of RU.
This query is cheap and fast:
db.Tests.count({ UserID: { '$eq': '5ac161d742092040783a4ee1' }})
but this one is slow and expensive:
db.Tests.count({ ReferenceID: { '$gt': 10 }})
even though this query is fast:
db.Tests.find({ ReferenceID: { '$gt': 10 }}).sort({ ReferenceID: 1 })
I also found this: https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/36142468-make-count-aware-of-indexes. Note the status: "We have started work on this feature. Will update here when this becomes generally available."
Pretty disappointing to be honest, especially since this limitation hasn't been addressed for almost 2 years. Note - I am not an expert in this matter and I'd love to be proven wrong, since I also need this feature.
BTW: I noticed that simple indexes seem to be created automatically for each individual field, so no need to create them manually.

ExecuteNextAsync Not Working

I am working with Azure DocumentDB and looking at the ExecuteNextAsync operation. What I am seeing is that ExecuteNextAsync returns no results. I am using examples I found online, and they don't generate any results. If I call an enumeration operation on the initial query, results are returned. Is there an example showing the complete configuration for using ExecuteNextAsync?
Update
To be more explicit, I am not actually getting any results. The call seems to just run, and no error is generated.
Playing around with the collection definition, I found that this occurred when I set the collection size to 250 GB. I tested with the collection at 10 GB and it did work, for a while. The latest testing shows that the operation is now hanging again.
I have two collections generated. The first collection appears to work properly. The second one appears to fail on this operation.
Individual calls to ExecuteNextAsync may return 0 results, but when you run the query to completion by calling it until HasMoreResults is false, you will always get the complete results.
Almost always, a single call to ExecuteNextAsync will return results, but you may get 0 results, most commonly for two reasons:
If the query is a scan, then DocumentDB will make partial progress based on available throughput. Here no results are returned, but a new continuation token based on the latest progress is returned to resume execution.
If it's a cross-partition query, then each call executes against a single partition. In this case, the call will return no results if that partition has no documents that match the query.
If you want queries to deterministically return results, you must use SELECT TOP vs. using the continuation token/ExecuteNextAsync as a mechanism for paging. You can also read query results in parallel across multiple partitions by changing FeedOptions.MaxDegreeOfParallelism to -1.
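As a hedged sketch of that drain-to-completion pattern with the .NET DocumentDB SDK (the database, collection, document type, and filter below are placeholders, not from the question; requires Microsoft.Azure.Documents.Client and Microsoft.Azure.Documents.Linq):

// Drain the query until HasMoreResults is false, accumulating results
// across calls that may individually return 0 items.
var query = client.CreateDocumentQuery<MyDoc>(
        UriFactory.CreateDocumentCollectionUri("mydb", "mycoll"), // placeholder names
        new FeedOptions { MaxDegreeOfParallelism = -1, EnableCrossPartitionQuery = true })
    .Where(d => d.UserId == "someUser") // placeholder filter
    .AsDocumentQuery();

var results = new List<MyDoc>();
while (query.HasMoreResults)
{
    // An individual call may legitimately return 0 items (scan progress,
    // or an empty partition); HasMoreResults decides when we are done.
    FeedResponse<MyDoc> page = await query.ExecuteNextAsync<MyDoc>();
    results.AddRange(page);
}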

Lucene search returns no records

I am facing a quite strange issue searching with Lucene. I have a query with 3 clauses. If I launch the query with just 2 clauses in Share, it returns several documents, including the one I am seeking. However, if I add the third clause and run the query in Share, it returns no results, yet it returns the document I am looking for when I launch it in the Alfresco console!
I guess it is not a permissions issue, since I get the document I am looking for when the query is less restrictive. The query with the third clause fails only for one specific value; for the others it works fine.
It may be an indexing problem, but in that case I think it should fail when launching the query in the Alfresco console as well.
Any help?
Querying in Alfresco Share differs from querying in the Nodebrowser or directly through JavaScript.
If you take a look at alfresco/templates/webscripts/org/alfresco/slingshot/search/search.lib.js, the repository webscript that Share triggers, you'll see how the code processes and filters the results before returning them.
So you'll need to play around with it to get the right results.

Oracle does full table scan but returns results so quickly

When I open up TOAD and do a select * from a table, the results (the first 500 rows) come back almost instantly, but the explain plan shows a full table scan and the table is huge.
How come the results are so quick?
In general, Oracle does not need to materialize the entire result set before it starts returning the data (there are, of course, cases where Oracle has to materialize the result set in order to sort it before it can start returning data). Assuming that your query doesn't require the entire result set to be materialized, Oracle will start returning the data to the client process whether that client process is TOAD or SQL*Plus or a JDBC application you wrote. When the client requests more data, Oracle will continue executing the query and return the next page of results. This allows TOAD to return the first 500 rows relatively quickly even if it would ultimately take many hours for Oracle to execute the entire query and to return the last row to the client.
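To make the paging concrete, here is a minimal JDBC sketch (the table name and fetch size are assumptions, and an open java.sql.Connection conn is assumed); the client fetches rows in batches, so the first batch can arrive long before the full scan would finish:

// Rows stream back in fetch-size batches; stopping early means Oracle
// never has to finish the full table scan.
Statement stmt = conn.createStatement();
stmt.setFetchSize(500); // ask the driver to fetch 500 rows per round trip
ResultSet rs = stmt.executeQuery("select * from big_table"); // hypothetical table
int shown = 0;
while (rs.next() && shown < 500) {
    System.out.println(rs.getString(1)); // display the first column, as a grid would
    shown++;
}
rs.close();
stmt.close();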
TOAD only returns the first 500 rows for performance, but if you were to run that query through another Oracle interface, JDBC for example, it would return the entire result set. My best guess is that the explain plan shows the cost for the case where it doesn't fetch just a subset of the records; that's how I use it. I don't have a source for this other than my own experience with it.
