I am using Azure Cosmos DB to store customer data in my multi-tenant app. One of my customers started complaining about long wait times when querying their data. As a quick fix, I created a dedicated Cosmos DB instance for them and copied their data to that dedicated instance. Now I have two databases that contain exact copies of this customer's data. We'll call these databases db1 and db2. db1 contains all data for all customers, including the customer in question; db2 contains only the data for this customer. In both databases the partition key is an aggregate of tenant id and date, called ownerTime, and each database contains a single container named "call".
I then run this query in both databases:
select c.id,c.callTime,c.direction,c.action,c.result,c.duration,c.hasR,c.hasV,c.callersIndexed,c.callers,c.files,c.tags_s,c.ownerTime
from c
where
c.ownerTime = '352897067_202011'
and c.callTime>='2020-11-01T00:00:00'
and c.callTime<='2020-11-30T23:59:59'
and (CONTAINS(c.phoneNums_s, '7941521523'))
As you can see, I am isolating one partition (ownerTime: 352897067_202011). In this partition, there are about 50,000 records in each database.
In db1 (the database with all customer data), this uses 5116.38 RUs. In db2 (the dedicated instance), this query uses 65.8 RUs.
Why is there this discrepancy? The data in these two partitions is exactly the same across the two databases. The indexing policy is exactly the same as well. I suspect that db1 is trying to do a fan-out query. But why would it do that? I have the query set up so that it will only look in this one partition.
Here are the stats I retrieved for the above query after running it on each database:
db1
Request Charge: 5116.38 RUs
Retrieved document count: 8
Retrieved document size: 18168 bytes
Output document count: 7
Output document size: 11793 bytes
Index hit document count: 7
Index lookup time: 5521.42 ms
Document load time: 7.81 ms
Query engine execution time: 0.23 ms
System function execution time: 0.01 ms
User defined function execution time: 0 ms
Document write time: 0.07 ms
Round Trips: 1
db2
Request Charge: 65.8 RUs
Showing Results: 1 - 7
Retrieved document count: 7
Retrieved document size: 16585 bytes
Output document count: 7
Output document size: 11744 bytes
Index hit document count: 7
Index lookup time: 20.72 ms
Document load time: 4.8099 ms
Query engine execution time: 0.2001 ms
System function execution time: 0.01 ms
User defined function execution time: 0 ms
Document write time: 0.05 ms
Round Trips: 1
The indexing policy for both databases is:
{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        },
        {
            "path": "/callers/*"
        },
        {
            "path": "/files/*"
        }
    ]
}
Update: I recently updated this question with a clearer query and the returned query stats.
Following up from comments.
Since db1 has all customers' data, the physical partition will have a lot more unique values for callTime, so the number of index pages scanned to evaluate the callTime filter will be high. In the case of db2, since only one customer's data is there, the logical and physical partition are the same. So while this is not a fan-out, the query engine still has to evaluate the range filter on callTime against index entries from all the other customers' data.
To improve performance on db1, you should create a composite index on /ownerTime and /callTime; see below.
"compositeIndexes":[
[
{
"path":"/ownerTime"
},
{
"path":"/callTime"
}
]
],
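For clarity, the composite index sits alongside the existing paths, so the full indexing policy would then look roughly like this (the policy from the question with the composite index appended):
{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        },
        {
            "path": "/callers/*"
        },
        {
            "path": "/files/*"
        }
    ],
    "compositeIndexes": [
        [
            {
                "path": "/ownerTime"
            },
            {
                "path": "/callTime"
            }
        ]
    ]
}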
Thanks.
Related
I am new to Cosmos DB and facing issue in designing my DB.
I have a data similar to below structure
{
    "userId": "64_CHAR_ID",
    "gpId": "34_CHAR_ID"
    ... Other data
}
Currently my DB is partitioned on userId, as all the queries so far were by userId. Now I want to query my DB based on gpId when the userId is not known, so it ends up as a cross-partition query, which takes a lot of clock time (more than 5 minutes) and RUs (more than 3k RUs).
The query I am using is
SELECT * FROM c WHERE c.gpId='SOME_GPID'
According to the Microsoft docs, we should avoid cross-partition queries when the dataset is large, and in my case the dataset is quite large (~80 GB).
So what would be a better design/strategy to query the data by gpId in Cosmos DB? My requirement is to query by gpId in near real time.
Note: the current RU limit is set to 500,000 RU/s and the container is also on autoscale.
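For reference, this is roughly how the cross-partition query is issued from the .NET SDK (the container variable, result type, and parallelism settings here are illustrative assumptions, not my exact code):
// No partition key is supplied, so the SDK fans the query out to every partition.
var query = new QueryDefinition("SELECT * FROM c WHERE c.gpId = @gpId")
    .WithParameter("@gpId", "SOME_GPID");
var iterator = container.GetItemQueryIterator<dynamic>(
    query,
    requestOptions: new QueryRequestOptions { MaxConcurrency = -1 }); // run partitions in parallel
while (iterator.HasMoreResults)
{
    var page = await iterator.ReadNextAsync();
    // page.RequestCharge accumulates the per-partition RU cost of the fan-out.
}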
I'm new to Azure Cosmos DB and I have this new project where I decided to give it a go.
My DB has only one collection, where around 6,000 new items are added every day, and each looks like this
{
    "Result": "Pass",
    "date": "23-Sep-2021",
    "id": "user1#example.com"
}
The date is the partition key, and it is the date on which the item was added to the collection; the same id can be added again every day, as follows
{
    "Result": "Fail",
    "date": "24-Sep-2021",
    "id": "user1#example.com"
}
The application that uses this DB will query by id and date to retrieve the Result.
I read some Azure Cosmos DB documentation and found that selecting the partition key carefully can improve the performance of the database and the RUs used for each request.
I tried running this query, and it consumed 2.9 RUs; the collection has about 23,000 items.
SELECT * FROM c
WHERE c.id = 'user1#example.com' AND c.date = '24-Sep-2021'
Here are my questions
Is using date a good partition key for my scenario? Any room for improvement?
Will the consumed RUs per request increase over time as the number of items in the collection increases?
Thanks.
For a write-heavy workload, using date as a partition key is a bad choice because you will always have a hot partition on the current date. However, if the amount of data being written is consistent and the write volume is low, it can be used and you will get good distribution of data on storage.
In read-heavy scenarios, date can be a good partition key if it is used to answer most of the queries in the app.
The value for id must be unique per partition key value, so for your data model to work you can only have one "id" value per day.
If this is the case for your app, then you can make one additional optimization and replace the query with a point read, ReadItemAsync(). This takes the partition key value and the id. It is the fastest and most efficient way to read data because it does not go through the query engine and reads directly from the backend data store. A point read for a document of 1 KB or less always costs 1 RU.
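A minimal sketch of that point read with the .NET SDK, using the id and date from the question (the container variable is an assumption):
// Point read: id + partition key value; the query engine is not involved.
ItemResponse<dynamic> response = await container.ReadItemAsync<dynamic>(
    id: "user1#example.com",
    partitionKey: new PartitionKey("24-Sep-2021"));
// response.RequestCharge should be about 1 RU for a document of 1 KB or less.
var result = response.Resource;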
I have a collection in Azure Cosmos DB with IoT messages (called DeviceEvents). The partition key is the application id. I want to query by device id (each device belongs to exactly one application). So I have a query like this
SELECT VALUE root
FROM root
WHERE root["ApplicationId"] = 69 AND root["DeviceId"] = 2978
AND root["TimeStamp"] >= "2021-01-30T20:30:05.1635579Z"
AND root["TimeStamp"] <= "2021-02-19T20:30:05.1635969Z"
ORDER BY root["TimeStamp"] DESC OFFSET 0 LIMIT 30
When I execute the query like this, I get Request Charge 10.96 RUs, Index lookup time 2.22 ms, Document load time 0.41 ms, and Query engine execution time 0.24 ms.
When I execute the query without the partition key
SELECT VALUE root
FROM root
WHERE root["DeviceId"] = 2978
AND root["TimeStamp"] >= "2021-01-30T20:30:05.1635579Z"
AND root["TimeStamp"] <= "2021-02-19T20:30:05.1635969Z"
ORDER BY root["TimeStamp"] DESC OFFSET 0 LIMIT 30
When I execute the query like this, I get Request Charge 10.45 RUs, Index lookup time 1.91 ms, Document load time 0.5 ms, and Query engine execution time 0.24 ms.
While the numbers vary, the query with the partition key consistently consumes more RUs and has a higher index lookup time.
I don't have enough data for Cosmos DB to create different physical partitions right now, but I will probably need them in the future. My relevant indexing policy is this:
"compositeIndexes": [
[
{
"path": "/DeviceId",
"order": "ascending"
},
{
"path": "/TimeStamp",
"order": "descending"
}
]
So my questions are
Do I need the partition key in the query?
Do I need the partition key in the index definition?
The reason you're getting confusing query stats is that the amount of data is too small to provide meaningful results.
With a small amount of data (approximately 20 GB or less) you'll only be on a single physical partition. Cross-partition queries run just as fast as single-partition queries when everything is on the same physical partition.
Where things start to blow up is when the database grows (scales). If you design your database to have a high number of cross-partition queries, your database, by design, will not scale. So you definitely need (or should try as much as possible) to use the partition key in your queries, especially high-volume queries.
I would also add composite indexes with TimeStamp in both ascending and descending order, as shown below.
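In the indexing policy that would look something like this (a sketch extending the composite index above; it assumes DeviceId stays ascending in both):
"compositeIndexes": [
    [
        {
            "path": "/DeviceId",
            "order": "ascending"
        },
        {
            "path": "/TimeStamp",
            "order": "ascending"
        }
    ],
    [
        {
            "path": "/DeviceId",
            "order": "ascending"
        },
        {
            "path": "/TimeStamp",
            "order": "descending"
        }
    ]
]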
The other thing you mentioned is that every device belongs to the same applicationId. If that is the case, then your container cannot grow larger than 20 GB. If every device in this app has an applicationId of 69, then you should redesign this container and find a new partition key. If your queries are always by device id, then that would make a much better partition key.
Microsoft makes it clear that cross-partition queries "fan-out" the query to each partition (link):
The following query doesn't have a filter on the partition key (DeviceId). Therefore, it must fan-out to all physical partitions where it is run against each partition's index:
So I am curious if that "fan-out" can be optimized by doing a range query on a partition key, such as STARTSWITH.
To test it, I created a small Cosmos DB with seven documents:
{
    "partitionKey": "prefix1:",
    "id": "item1a"
},
{
    "partitionKey": "prefix1:",
    "id": "item1b"
},
{
    "partitionKey": "prefix1:",
    "id": "item1c"
},
{
    "partitionKey": "prefix1X:",
    "id": "item1d"
},
{
    "partitionKey": "prefix2:",
    "id": "item2a"
},
{
    "partitionKey": "prefix2:",
    "id": "item2b"
},
{
    "partitionKey": "prefix3:",
    "id": "item3a"
}
It has the default indexing policy with partition key "/partitionKey". Then I ran a bunch of queries:
SELECT * FROM c WHERE STARTSWITH(c.partitionKey, 'prefix1')
-- Actual Request Charge: 2.92 RUs
SELECT * FROM c WHERE c.partitionKey = 'prefix1:' OR c.partitionKey = 'prefix1X:'
-- Actual Request Charge: 3.02 RUs
SELECT * FROM c WHERE STARTSWITH(c.partitionKey, 'prefix1:')
SELECT * FROM c WHERE c.partitionKey = 'prefix1:'
-- Each Query Has Actual Request Charge: 2.89 RUs
SELECT * FROM c WHERE STARTSWITH(c.partitionKey, 'prefix2')
SELECT * FROM c WHERE c.partitionKey = 'prefix2:'
-- Each Query Has Actual Request Charge: 2.86 RUs
SELECT * FROM c WHERE STARTSWITH(c.partitionKey, 'prefix3')
SELECT * FROM c WHERE c.partitionKey = 'prefix3:'
-- Each Query Has Actual Request Charge: 2.83 RUs
SELECT * FROM c WHERE c.partitionKey = 'prefix2:' OR c.partitionKey = 'prefix3:'
-- Actual Request Charge: 2.99 RUs
The request charges were consistent when re-running the queries, and the pattern of charge growth seemed consistent with the result set and query complexity, with the exception of maybe the 'OR' queries.
SELECT * FROM c
-- Actual Request Charge: 2.35 RUs
And the basic fan-out to all partitions is even cheaper than targeting a specific partition, even with an equality operator. I don't understand how this can be.
All this being said, my sample database is extremely small with only seven documents. The query set is probably not big enough to trust the results.
So, if I had millions of documents, would STARTSWITH(c.partitionKey, 'prefix') be more optimized than fanning out to all partitions?
I was trying to determine if there is any benefit of this approach myself, and according to the answers it does not seem like there is.
I did just learn about the new hierarchical partition keys feature that is in private preview and seems to address the problem we are trying to solve:
https://devblogs.microsoft.com/cosmosdb/hierarchical-partition-keys-private-preview/
Hierarchical partition keys are now available in private preview for the Azure Cosmos DB Core (SQL) API. With hierarchical partition keys, also known as sub-partitioning, you can now natively partition your container with up to three levels of partition keys. This enables more optimal partitioning strategies for multi-tenant scenarios or workloads that would otherwise use synthetic partition keys. Instead of having to choose a single partition key – which often leads to performance trade-offs – you can now use up to three keys to further sub-partition your data, enabling more optimal data distribution and higher scale.
Since this allows up to 3 keys, it could solve the problem by breaking up the prefixes into separate keys, or at least further optimize it if there are more than 3.
Example
(Usage example from link):
https://github.com/AzureCosmosDB/HierarchicalPartitionKeysFeedbackGroup#net-v3-sdk-2
// Get the full partition key path
var id = "0a70accf-ec5d-4c2b-99a7-af6e2ea33d3d";
var fullPartitionkeyPath = new PartitionKeyBuilder()
    .Add("Contoso") // TenantId
    .Add("Alice")   // UserId
    .Build();
var itemResponse = await containerSubpartitionByTenantId_UserId.ReadItemAsync<dynamic>(id, fullPartitionkeyPath);
Considerations
From the preview link, it looks like you would need to opt in to the preview and create a new container:
New containers only – all keys must be specified upon container creation
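For reference, creating such a sub-partitioned container from the .NET SDK looks roughly like this in the preview (a sketch based on the preview docs; the container id, throughput value, and database variable are placeholder assumptions, and the preview API may still change):
// Container sub-partitioned by TenantId, then UserId (up to three levels).
// All partition key paths must be specified when the container is created.
var containerProperties = new ContainerProperties(
    id: "mySubpartitionedContainer",
    partitionKeyPaths: new List<string> { "/TenantId", "/UserId" });
var containerResponse = await database.CreateContainerIfNotExistsAsync(containerProperties, throughput: 400);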
As you scale, you get fewer "logical partitions" per "physical partition", until eventually each partition key value has its own physical partition.
So:
if I had millions of documents, would STARTSWITH(c.partitionKey, 'prefix') be more optimized than fanning out to all partitions?
Both queries would fan out across multiple partitions.
And I'm pretty sure that since "Azure Cosmos DB uses hash-based partitioning to spread logical partitions across physical partitions", there's no locality between partition keys with a common prefix, and each STARTSWITH query will have to fan out across all the physical partitions.
The docs do suggest that there is some efficiency ordering, though:
With Azure Cosmos DB, typically queries perform in the following order from fastest/most efficient to slower/less efficient.
GET on a single partition key and item key
Query with a filter clause on a single partition key
Query without an equality or range filter clause on any property
Query without filters
I am storing millions of documents in Cosmos DB with a proper partition key. I need to retrieve, say, 500,000 documents to do some calculations and display the output in the UI, and this should happen within, say, 10 seconds.
Would this be possible? I have tried this, but it takes nearly a minute. Is this the correct approach for this kind of requirement?
"id": "Latest_100_Sku1_1496188800",
"PartitionKey": "Latest_100_Sku1
"SnapshotType": 2,
"AccountCode": "100",
"SkuCode": "Sku1",
"Date": "2017-05-31T00:00:00",
"DateEpoch": 1496188800,
"Body": "rVNBa4MwFP4v72xHElxbvYkbo4dBwXaX0UOw6ZRFIyaBFfG/7zlT0EkPrYUcku+9fO/7kvca"
Size of one document: 825 bytes
I am using autoscale with 4,000 RU/s throughput.
Query statistics: I am using 2 queries.
Query 1:
select * from c where c.id in ({ids})
Here I pass the PartitionKey in the query options.
Query Statistics
Request Charge: 102.11 RUs
Showing Results: 1 - 100
Retrieved document count: 200
Retrieved document size: 221672 bytes
Output document count: 200
Output document size: 221972 bytes
Index hit document count: 200
Index lookup time: 17.0499 ms
Document load time: 1.59 ms
Query engine execution time: 0.3401 ms
System function execution time: 0.06 ms
User defined function execution time: 0 ms
Document write time: 0.16 ms
Round Trips: 1
Query 2:
select * from c where c.PartitionKey in ({keys}) and c.DateEpoch>={startDate.ToEpoch()} and c.DateEpoch<={endDate.ToEpoch()}
Query Statistics
Request Charge: 226.32 RUs
Showing Results: 1 - 100
Retrieved document count: 200
Retrieved document size: 176580 bytes
Output document count: 200
Output document size: 176880 bytes
Index hit document count: 200
Index lookup time: 88.31 ms
Document load time: 4.2399 ms
Query engine execution time: 0.4701 ms
System function execution time: 0.06 ms
User defined function execution time: 0 ms
Document write time: 0.19 ms
Round Trips: 1
Query #1 looks fine. Query #2 would most likely benefit from a composite index on DateEpoch. I'm not sure what the UDF is, but if you're converting dates to epoch values you will want to read the new blog post New date and time system functions in Azure Cosmos DB.
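For example, a composite index covering the second query's filter might look something like this (a sketch; the paths are taken from the sample document above):
"compositeIndexes": [
    [
        {
            "path": "/PartitionKey",
            "order": "ascending"
        },
        {
            "path": "/DateEpoch",
            "order": "ascending"
        }
    ]
]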
Overall, retrieving 500K documents in 1-2 queries to do some calculations seems like a strange use case. Typically, most people will pre-calculate values and persist them using a materialized view pattern built on the change feed, as sketched below. Depending on how often you run these two queries, this is often a more efficient use of compute resources.
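A minimal sketch of that pattern with the .NET SDK change feed processor (the container variables, processor name, and the BuildAggregate helper are placeholder assumptions, not a prescribed implementation):
// Watch the source container's change feed and maintain pre-calculated
// aggregates in a separate "materialized view" container.
var processor = sourceContainer
    .GetChangeFeedProcessorBuilder<dynamic>("calcProcessor", async (changes, cancellationToken) =>
    {
        foreach (var doc in changes)
        {
            // BuildAggregate is a placeholder for your own calculation logic.
            await viewContainer.UpsertItemAsync(BuildAggregate(doc), cancellationToken: cancellationToken);
        }
    })
    .WithInstanceName("calc-worker-1")
    .WithLeaseContainer(leaseContainer)
    .Build();
await processor.StartAsync();
// The UI can then read the small, pre-aggregated documents instead of 500K raw ones.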