I am working on a repository pattern for Azure Cosmos DB to learn this DB and get acquainted with it. Here is what I use for adding a range of items:
public async Task<IList<ItemResponse<T>>?> AddRange<T>(IDictionary<PartitionKey, T> items)
{
    if (items == null || !items.Any())
        return null;

    var responses = new List<ItemResponse<T>>();
    foreach (var item in items)
    {
        responses.Add(await _container.CreateItemAsync(item.Value, item.Key));
    }
    return responses;
}
Is it possible to do this inside a transactional batch? I gave it a go and wanted to add multiple items of the same type T in the transaction. A transactional batch uses a shared partition key, but I got Bad Request as the response when I tried it out. Does Azure Cosmos DB only support adding multiple items in a transactional batch when they all share the same partition key value, and is that collision the cause of the Bad Request?
It should perhaps be possible to add multiple items of the same type T at once, but I ended up implementing it as shown above. The downside of this solution is multiple round trips to the database when adding multiple items.
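For reference, here is a minimal sketch of what the batch variant might look like (my own naming, not tested code). The key constraint is that every operation in a TransactionalBatch must use the batch's single partition key value; an item whose partition key property does not match it is one common cause of a Bad Request:

public async Task<bool> AddBatch<T>(PartitionKey partitionKey, IEnumerable<T> items)
{
    // All items are created under the one partition key passed to
    // CreateTransactionalBatch; the batch commits or fails as a whole.
    TransactionalBatch batch = _container.CreateTransactionalBatch(partitionKey);
    foreach (T item in items)
    {
        batch = batch.CreateItem(item);
    }

    using TransactionalBatchResponse response = await batch.ExecuteAsync();
    return response.IsSuccessStatusCode;
}

Note that a batch is also capped at a fixed number of operations (100 in the current service limits), so large ranges would still need to be chunked.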
Related
I have code to get all the data in one collection in Cosmos DB; if the collection is empty, it starts inserting.
The first time the code is run the collection is empty; however, SetIterator.HasMoreResults returns true even though the collection is empty.
Then an error is raised: Microsoft.Azure.Cosmos.CosmosException: Response status code does not indicate success: NotFound. I have checked with a collection that is not empty, and the code runs fine.
I can use try/catch to handle it, but that does not seem like a nice solution. Does anyone know how to check if the collection is empty?
var itemList = new List<T>();
using (FeedIterator<T> setIterator = _container.GetItemLinqQueryable<T>()
    .ToFeedIterator())
{
    while (setIterator.HasMoreResults)
    {
        foreach (var item in await setIterator.ReadNextAsync())
        {
            itemList.Add(item);
        }
    }
}
return itemList;
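As an aside, a count query can check emptiness without iterating at all; a minimal sketch, assuming the CountAsync extension from the Microsoft.Azure.Cosmos.Linq namespace is available:

using Microsoft.Azure.Cosmos.Linq;

// Executes a COUNT query server-side instead of streaming documents back.
Response<int> countResponse = await _container.GetItemLinqQueryable<T>().CountAsync();
bool isEmpty = countResponse.Resource == 0;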
You can use indexing to determine whether a collection is empty in Cosmos DB. To do this you can use unique indexes, which can be created only on collections that do not contain any documents.
Important
Unique indexes can be created only when the collection is empty
(contains no documents).
To create unique indexes on collections, you can go through the unique indexes documentation.
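For illustration, a sketch of defining a unique key policy at container creation time with the .NET SDK (the database object, ids and path here are hypothetical):

// Unique key policies can only be set when the container is created,
// i.e. while it contains no documents.
ContainerProperties properties = new ContainerProperties(id: "items", partitionKeyPath: "/pk");
properties.UniqueKeyPolicy = new UniqueKeyPolicy
{
    UniqueKeys = { new UniqueKey { Paths = { "/email" } } }
};
Container container = await database.CreateContainerIfNotExistsAsync(properties);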
How can I get the item count for a particular partition key using .NET Core, preferably via the Object Persistence Interface or Document Interfaces?
Since I do not see any docs anywhere, I currently get the item count by retrieving all the items and counting them, but those reads are very expensive.
What is the best practice for such an item-count request? Thank you.
DynamoDB is mostly a document-oriented key-value database, so it is not optimized for common relational database functions (like item counts).
To minimize the data that is transmitted and to improve speed, you may want to do the following:
Create a Lambda function that returns the item count
This avoids transmitting data outside of AWS, which is slow and expensive.
Query options
Use only keys in your projection-expression, reducing the data that is transmitted from the database, and set the max page-size, reducing the number of calls needed.
Stream option
Streams could also be used for keeping counts, e.g. as described in
https://medium.com/signiant-engineering/real-time-aggregation-with-dynamodb-streams-f93547cfb244
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-gsi-aggregation.html
Related SO Question
Complexity of finding total records count with partition key in nosql dynamodb table?
I just realized that using the low-level interface, one can set Select = "COUNT" on a QueryRequest; then calling QueryAsync() or Query() will return only the count, as an integer. Please refer to the code sample below.
private static QueryRequest getStockRecordCountQueryRequest(string tickerSymbol, string prefix)
{
    string partitionName = ":v_PartitionKeyName";
    string sortKeyPrefix = ":v_sortKeyPrefix";

    var request = new QueryRequest
    {
        TableName = Constants.TableName,
        ReturnConsumedCapacity = ReturnConsumedCapacity.TOTAL,
        Select = "COUNT",
        KeyConditionExpression = $"{Constants.PartitionKeyName} = {partitionName} and begins_with({Constants.SortKeyName}, {sortKeyPrefix})",
        ExpressionAttributeValues = new Dictionary<string, AttributeValue>
        {
            { $"{partitionName}", new AttributeValue { S = tickerSymbol } },
            { $"{sortKeyPrefix}", new AttributeValue { S = prefix } }
        },
        // Optional parameters.
        ConsistentRead = false,
        ExclusiveStartKey = null,
    };
    return request;
}
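A hypothetical usage sketch (the ticker and prefix values are made up): with Select = "COUNT", the response carries only the matching-item count, not the items themselves:

var client = new AmazonDynamoDBClient();
QueryRequest request = getStockRecordCountQueryRequest("AMZN", "2020-");
QueryResponse response = await client.QueryAsync(request);
int count = response.Count; // number of matching items, no item payload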
But I would like to point out that this will still consume the same read units as retrieving all the items and counting them yourself. Since it returns only the count as an integer, though, it is a lot more efficient than transmitting the entire item list across the wire.
I think using DynamoDB Streams is the more proper way to get counts for a large project. It is just a lot more complicated to implement.
Imagine the following function, which queries a GlobalSecondaryIndex and its associated range key in order to find a limited number of results:
@Override
public List<Statement> getAllStatementsOlderThan(String userId, String startingDate, int limit) {
    if (StringUtils.isNullOrEmpty(startingDate)) {
        startingDate = UTC.now().toString();
    }
    LOG.info("Attempting to find all Statements older than ({})", startingDate);

    Map<String, AttributeValue> eav = Maps.newHashMap();
    eav.put(":userId", new AttributeValue().withS(userId));
    eav.put(":receivedDate", new AttributeValue().withS(startingDate));

    DynamoDBQueryExpression<Statement> queryExpression = new DynamoDBQueryExpression<Statement>()
        .withKeyConditionExpression("userId = :userId and receivedDate < :receivedDate")
        .withExpressionAttributeValues(eav)
        .withIndexName("userId-index")
        .withConsistentRead(false);

    if (limit > 0) {
        queryExpression.setLimit(limit);
    }

    List<Statement> statementResults = mapper.query(Statement.class, queryExpression);
    LOG.info("Successfully retrieved ({}) values", statementResults.size());
    return statementResults;
}
List<Statement> results = statementRepository.getAllStatementsOlderThan(userId, UTC.now().toString(), 5);
assertThat(results.size()).isEqualTo(5); // NEVER passes
The limit isn't respected whenever I query against the database. I always get back all results that match my search criteria, so if I set the startingDate to now, I get every item in the database, since they're all older than now.
You should use the queryPage function instead of query.
From the DynamoDBQueryExpression.setLimit documentation:
Sets the maximum number of items to retrieve in each service request to DynamoDB. Note that when calling DynamoDBMapper.query, multiple requests are made to DynamoDB if needed to retrieve the entire result set. Setting this will limit the number of items retrieved by each request, NOT the total number of results that will be retrieved. Use DynamoDBMapper.queryPage to retrieve a single page of items from DynamoDB.
As they've rightly answered, the setLimit and withLimit functions limit only the number of records fetched in each particular request; internally, multiple requests take place to fetch the full result set.
If you want to limit the number of records fetched across all the requests, then you might want to use "Scan".
An example of the same can be found here.
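The same per-request semantics show up in the low-level API used earlier in this thread; here is a C# sketch (table, index and variable names are hypothetical), where reading just the first page is the equivalent of queryPage:

// Limit caps items per request, not per query; stopping after the first
// response is what actually bounds the total number of results.
var client = new AmazonDynamoDBClient();
var request = new QueryRequest
{
    TableName = "Statements",
    IndexName = "userId-index",
    KeyConditionExpression = "userId = :v_userId and receivedDate < :v_receivedDate",
    ExpressionAttributeValues = new Dictionary<string, AttributeValue>
    {
        { ":v_userId", new AttributeValue { S = userId } },
        { ":v_receivedDate", new AttributeValue { S = startingDate } }
    },
    Limit = 5
};
QueryResponse response = await client.QueryAsync(request);
// response.Items holds at most 5 items; response.LastEvaluatedKey would be
// used to keep paging if the full result set were wanted.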
I wrote the following code for one of my controller actions in an ASP.NET Core application, and I want to know how many times the database was called. There are two LINQ statements: one that fetches the data and another that sorts the children. How does one confirm the number of database calls? I'm using the LocalDB instance that gets created by default in a code-first .NET Core application.
public async Task<IActionResult> Edit(int? id)
{
    if (id == null)
    {
        return NotFound();
    }

    var giftCard = await _context.GiftCards
        .Include(g => g.Transactions)
        .SingleOrDefaultAsync(m => m.Id == id);

    if (giftCard == null)
    {
        return NotFound();
    }

    giftCard.Transactions = giftCard.Transactions.OrderBy(t => t.TransactionDate).ToList();
    return View(giftCard);
}
LINQ allows developers to build up a query that is executed only once the full query is built; you can compose a query from multiple expressions without making a single call to the database, and LINQ will delay the call until the last possible moment.
In your case, awaiting SingleOrDefaultAsync materializes the query, so one database call is made to fetch the requested GiftCard together with its Transactions (due to Include(g => g.Transactions)). The subsequent OrderBy is performed on transactions that are already in memory, so no database call is made there.
Ultimately, your whole code will make only a single database call.
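To confirm the call count empirically, one option is to log the generated SQL commands; a minimal sketch, assuming EF Core 5 or later where LogTo is available:

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    => optionsBuilder
        .UseSqlServer(connectionString) // the LocalDB connection string
        // Print each executed command; one logged command == one database call.
        .LogTo(Console.WriteLine, new[] { DbLoggerCategory.Database.Command.Name });

Running the Edit action with this in place should show a single SELECT (with a join for the included Transactions).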
In RavenDB I can store Product and Category objects, and they will automatically be located in different collections. This is fine.
But what if I have two logically completely different types of products that use the same class? Or, instead of two, I could have any number of different types of products. Would it then be possible to tell Raven to split the product documents up into collections, say based on a string property available on the Product class?
Thank you in advance.
EDIT:
I have created and registered the following store listener, which changes the collection for documents to be stored at runtime. This results in the documents correctly being stored in different collections, giving a nice logical grouping of the documents.
public class DynamicCollectionDefinerStoreListener : IDocumentStoreListener
{
    public bool BeforeStore(string key, object entityInstance, RavenJObject metadata)
    {
        var entity = entityInstance as EntityData;
        if (entity == null)
            throw new Exception("Cannot handle object of type " + entityInstance.GetType());

        metadata["Raven-Entity-Name"] = RavenJToken.FromObject(entity.TypeId);
        return true;
    }

    public void AfterStore(string key, object entityInstance, RavenJObject metadata)
    {
    }
}
However, it seems I have to adjust my queries too in order to get the objects back. A typical query of mine used to look like this:
session => session.Query<EntityData>().Where(e => e.TypeId == typeId)
with 'typeId' being the name of the new Raven collection (and the name of the entity type, saved as a separate field on the EntityData object too).
How would I go about querying my objects back? I can't find the spot where I can define my collection at runtime prior to executing my query.
Do I have to execute some raw Lucene queries? Or can I maybe implement a query listener?
EDIT:
I found a way of storing, querying and deleting objects using dynamically defined collections, but I'm not sure this is the right way to do it:
Document store listener:
(I use the class defined above)
Method resolving index names:
private string GetIndexName(string typeId)
{
    return "dynamic/" + typeId;
}
Store/Query/Delete:
// Storing
session.Store(entity);

// Querying
var someResults = session.Query<EntityData>(GetIndexName(entity.TypeId))
    .Where(e => e.EntityId == entity.EntityId);
var someMoreResults = session.Advanced.LuceneQuery<EntityData>(GetIndexName(entityTypeId))
    .Where("TypeId:Colors AND Range.Basic.ColorCode:Yellow");

// Deleting
var loadedEntity = session.Query<EntityData>(GetIndexName(entity.TypeId))
    .Where(e => e.EntityId == entity.EntityId)
    .SingleOrDefault();
if (loadedEntity != null)
{
    session.Delete<EntityData>(loadedEntity);
}
I have the feeling it's getting a little dirty, but is this the way to store, query and delete when specifying collection names at runtime? Or am I trapping myself this way?
Stephan,
You can provide the logic for deciding on the collection name using:
store.Conventions.FindTypeTagName
This is handled statically, using the generic type.
If you want to make that decision at runtime, you can provide it using a DocumentStoreListener.
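For the static case, a sketch of what that convention might look like (the branching logic here is made up, and DefaultTypeTagName is the client's fallback):

var store = new DocumentStore { Url = "http://localhost:8080" }; // hypothetical URL
store.Conventions.FindTypeTagName = type =>
    // Route one CLR type to a fixed collection; fall back to the default
    // tag name for everything else.
    type == typeof(Product)
        ? "Products"
        : DocumentConvention.DefaultTypeTagName(type);
store.Initialize();

Since FindTypeTagName only sees the Type, per-document routing (like your TypeId property) still needs the store listener approach.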