Why Not Always Use EnableCrossPartitionQuery - azure-cosmosdb

If my Cosmos DB has multiple partitions, is there any reason NOT to set EnableCrossPartitionQuery to true?
I know it is necessary when running a query that could hit multiple partitions. But what if the query uses a valid partition key and will definitely hit only one partition? Is there any performance loss or increased cost because I set that flag to true?

As far as I know, you need to set the partition key for a partitioned collection, and the cost will not change even if you also set EnableCrossPartitionQuery to true, because the request only scans the specific partition you set. I ran a sample test to verify it.
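// Assumes the legacy synchronous Java SDK (com.microsoft.azure.documentdb)
// and an existing DocumentClient named client.
FeedOptions feedOptions = new FeedOptions();
// Scope the query to the single partition key value "A".
PartitionKey partitionKey = new PartitionKey("A");
feedOptions.setPartitionKey(partitionKey);
// The flag under test: also allow cross-partition execution.
feedOptions.setEnableCrossPartitionQuery(true);
FeedResponse<Document> queryResults = client.queryDocuments(
        "/dbs/db/colls/part",
        "SELECT * FROM c",
        feedOptions);
System.out.println("Running SQL query...");
for (Document document : queryResults.getQueryIterable()) {
    System.out.println(String.format("\tRead %s", document));
}
// Print the request charge (RUs) to compare runs with and without the flag.
System.out.println(queryResults.getRequestCharge());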
I don't think you need to worry about this. The EnableCrossPartitionQuery option only needs to be set when a query against a partitioned collection is not scoped to a single partition key value. If you know the specific partition key, there is no need to set EnableCrossPartitionQuery.
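The routing behavior described in this answer can be sketched with a toy model. This is not the Cosmos DB SDK: the class name PartitionRouting, the method partitionsToVisit, and the use of hashCode() as the hash function are all assumptions made for illustration. It models the older SDK behavior discussed in this thread, where a query with no partition key is rejected unless cross-partition execution is enabled.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of query routing; not the Cosmos DB SDK.
final class PartitionRouting {

    // Returns the indexes of the partitions a query would visit.
    // partitionKey == null means the caller supplied no partition key.
    static List<Integer> partitionsToVisit(String partitionKey,
                                           boolean enableCrossPartitionQuery,
                                           int partitionCount) {
        List<Integer> targets = new ArrayList<>();
        if (partitionKey != null) {
            // Key supplied: exactly one partition, whatever the flag says.
            // hashCode() is a stand-in for the service's real hash function.
            targets.add(Math.floorMod(partitionKey.hashCode(), partitionCount));
            return targets;
        }
        if (!enableCrossPartitionQuery) {
            // No key and no permission to fan out: reject the query.
            throw new IllegalArgumentException(
                    "Partition key required when cross-partition query is disabled");
        }
        // No key, fan-out allowed: every partition is visited.
        for (int i = 0; i < partitionCount; i++) {
            targets.add(i);
        }
        return targets;
    }

    public static void main(String[] args) {
        System.out.println(partitionsToVisit("A", true, 4).size());  // 1
        System.out.println(partitionsToVisit("A", false, 4).size()); // 1
        System.out.println(partitionsToVisit(null, true, 4).size()); // 4
    }
}
```

In this model the flag only matters when no partition key is supplied, which matches the observation that the request charge did not change in the test above.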

Related

Cosmos DB read a single document without partition key

A container has a function called ReadItemAsync. The problem is I do not have the partition key, but only the id of the document. What is the best approach to get just a single item then?
Do I have to get it from a collection? Like:
var allItemsQuery = VesselContainer.GetItemQueryIterator<CoachVessel>("SELECT * FROM c where c.id=....");
var q = VesselContainer.GetItemLinqQueryable<CoachVessel>();
var iterator = q.ToFeedIterator();
var result = new List<CoachVessel>();
while (iterator.HasMoreResults)
{
    foreach (var item in await iterator.ReadNextAsync())
    {
        result.Add(item);
    }
}
Posting as answer.
Yes, you have to do a fan-out query, but id is only unique per partition key, so even then you may end up with multiple items. Frankly speaking, if you don't have the partition key for a point read, then the data model is not correct. It (or the application itself) should be redesigned.
Additionally, for small, single-partition collections this cross-partition query will not be too expensive because the collection is small. However, once the database starts to scale out, it will get increasingly slower and more expensive, because the query fans out to an ever-increasing number of physical partitions. As stated above, I would strongly recommend you modify the app to pass the partition key value in the request. This will allow you to do a single point read, which is extremely fast and efficient.
Good luck.
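To illustrate why an id alone may match more than one item, here is a minimal in-memory sketch. It is not the Cosmos DB SDK: ToyContainer, upsert, readItem, queryById, and the "fleet-1"/"fleet-2" partition key values are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory container; not the Cosmos DB SDK.
final class ToyContainer {
    // partition key value -> (id -> document body)
    private final Map<String, Map<String, String>> partitions = new HashMap<>();

    void upsert(String partitionKey, String id, String body) {
        partitions.computeIfAbsent(partitionKey, k -> new HashMap<>()).put(id, body);
    }

    // Point read: one partition, one lookup. Returns null if absent.
    String readItem(String id, String partitionKey) {
        Map<String, String> partition = partitions.get(partitionKey);
        return partition == null ? null : partition.get(id);
    }

    // Fan-out: every partition is checked, and several may hold the same id.
    List<String> queryById(String id) {
        List<String> matches = new ArrayList<>();
        for (Map<String, String> partition : partitions.values()) {
            String body = partition.get(id);
            if (body != null) {
                matches.add(body);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        ToyContainer vessels = new ToyContainer();
        vessels.upsert("fleet-1", "42", "vessel A");
        vessels.upsert("fleet-2", "42", "vessel B");
        System.out.println(vessels.readItem("42", "fleet-1")); // vessel A
        System.out.println(vessels.queryById("42").size());    // 2
    }
}
```

The point read touches a single partition, while the lookup by id alone has to check every partition and can return two different documents with the same id.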
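Try using ReadItemAsync with PartitionKey.None. Note that this only finds items that were stored without a partition key value:
ItemResponse<dynamic> response = await container.ReadItemAsync<dynamic>(ID, PartitionKey.None);
dynamic log = response.Resource;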

How to use the Partition Key in Cosmos DB via SDK or via SELECT query

Below is my sample JSON.
{
    "servletname": "cofaxEmail",
    "servlet-class": "org.cofax.cds.EmailServlet",
    "init-param": {
        "mailHost": "mail1",
        "mailHostOverride": "mail2"
    }
}
I have chosen "servletname" as my partition key (PK) because I receive it in every request, and with a few thousand server names it seems like the best choice.
My question is how to make the partition key work for me.
Do I have to specify the partition key option separately, like below
ItemResponse<ServerDto> ServerDtoResponse = await this.container.ReadItemAsync<ServerDto>(bocServerDto.mailHost, new PartitionKey(bocServerDto.servletname));
or
can I include the partition key in the SELECT query itself, without adding a separate new PartitionKey(), like
select * from r where r.servletname='cofaxEmail' and r.mailHost='mail1';
The crux of the question: is putting the partition key in the WHERE clause of the SELECT query enough to take advantage of the partition key?
Thanks
For any CRUD operation you would pass in the value for the partition key, for example on a point read:
ItemResponse<ServerDto> ServerDtoResponse = await this.container.ReadItemAsync<ServerDto>(bocServerDto.mailHost, new PartitionKey("cofaxEmail"));
For a query, you can either pass it in the queryRequest options or use it in the query as the first filter predicate. Here is an example of using the queryRequest options.
thanks.

query cosmos db without enabling cross partition query

When querying Cosmos DB, there is an option to set enableCrossPartitionQuery to true.
I am wondering what happens if I do not set it. Which partition will be used for the query?
thanks
If your collection is partitioned, then query, update, and delete operations need a partition key.
If you don't set one, you may see an error.
In this situation, if you don't want to set a partition key, or you don't know which partition the data belongs to, you can set enableCrossPartitionQuery = true to avoid the error. Setting enableCrossPartitionQuery = true means the request will scan all the partitions to filter the data. Naturally, query performance is bound to decline.
By the way, if your data size is small, I think the impact may be small. However, if the data size is large, I suggest you do your best to avoid setting this property.
I tested the sample project https://github.com/Azure-Samples/azure-cosmos-db-sql-api-nodejs-getting-started.git and it indeed doesn't require a partition key even when the container is partitioned.
However, based on the statements in the Cosmos DB REST API documentation:
I tested the Java SDK, and it requires the partition key when I query a partitioned container. Anyway, if you hit the error that indicates a missing partition key, you can try adding the property enableCrossPartitionQuery = true to resolve it. In most cases, I still suggest providing the partition key for better query performance.

In Azure Cosmos DB, can we change partition key later on once we decided at the beginning

I am new to Cosmos DB and I noticed that we can set the partition key based on needs to scale effectively through code like this:
DocumentCollection myCollection = new DocumentCollection();
myCollection.Id = "coll";
myCollection.PartitionKey.Paths.Add("/deviceId");
The question is: can we change the partition key after we have created the collection and specified the partition key? I may find out later that my choice of partition key was not appropriate.
Changing the partition key is not supported (see e.g. https://learn.microsoft.com/en-us/rest/api/cosmos-db/replace-a-collection). You would need to create a new collection.
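As a rough sketch of what "create a new collection" implies, the toy model below copies every document into a new grouping keyed by a different field. It is not the Cosmos DB SDK: Repartition, copyWithNewKey, the doc helper, and the "region" field are hypothetical, and a real migration would read from the old collection and write to the new one through the SDK or a migration tool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of re-partitioning; not the Cosmos DB SDK.
final class Repartition {

    // Copies every document into a new "collection" grouped by newKeyField.
    static Map<String, List<Map<String, String>>> copyWithNewKey(
            List<Map<String, String>> oldDocuments, String newKeyField) {
        Map<String, List<Map<String, String>>> newCollection = new HashMap<>();
        for (Map<String, String> doc : oldDocuments) {
            String keyValue = doc.get(newKeyField);
            if (keyValue == null) {
                throw new IllegalArgumentException("Document lacks field " + newKeyField);
            }
            newCollection.computeIfAbsent(keyValue, k -> new ArrayList<>()).add(doc);
        }
        return newCollection;
    }

    // Helper that builds one document with two fields.
    static Map<String, String> doc(String deviceId, String region) {
        Map<String, String> d = new HashMap<>();
        d.put("deviceId", deviceId);
        d.put("region", region);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> old = new ArrayList<>();
        old.add(doc("d1", "eu"));
        old.add(doc("d2", "eu"));
        old.add(doc("d3", "us"));
        Map<String, List<Map<String, String>>> byRegion = copyWithNewKey(old, "region");
        System.out.println(byRegion.get("eu").size()); // 2
        System.out.println(byRegion.get("us").size()); // 1
    }
}
```

Every document has to be rewritten under the new key, which is why the partition key is worth choosing carefully up front.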

Is partition key needed in queries even though JSON is indexed

I'm planning on using Cosmos DB (DocumentDB) and I'm trying to understand how queries, indexing, and partitions relate to each other.
How to partition and scale in Azure Cosmos Db talks about the partition key and other documentation indicates that partition key + id = unique id for the document. But then SQL Query and SQL syntax in Azure Cosmos Db says it provides automatic indexing of JSON documents without requiring explicit schema or creation of secondary indexes.
I understand that the partition key is important for scalability and for how data is stored. But when it comes to searching, is the partition key something like an extra filter/WHERE clause? All the documents are indexed, so I can execute a query like:
SELECT *
FROM Families
WHERE Families.address.state = "NY"
Should I still specify the partition key, or somehow indicate that cross-partition queries are allowed, when using this SQL query syntax?
Your first link gives the answer for this:
For partitioned collections, you can use PartitionKey to run the query against a single partition (though Cosmos DB can automatically extract this from the query text), and EnableCrossPartitionQuery to run queries that may need to be run against multiple partitions.
So, yes, you either need to specify a WHERE clause that makes the query run against a single partition, or set EnableCrossPartitionQuery to true in the query options.
You don't have to do that anymore; EnableCrossPartitionQuery is set to true by default nowadays. This means Cosmos won't complain if you omit the partition key in your query.
More info here.
You don't need to specify a partition key for the query. Recent SDK versions enable cross-partition queries by default.
