CosmosDB index. Should I exclude it? - azure-cosmosdb

In my SQL-API Cosmos DB I am not using any queries with a WHERE condition other than by the partition key plus a sort by an additional field (so a streamId, which is the partition key, and the event position, as I use Cosmos to store my aggregate roots).
I wonder what will happen if I just exclude all paths from indexing in that collection, except maybe keeping the field I use for sorting.

Alexander, according to your requirements, I think you could consider setting the index mode to None. Please refer to the explanations in this link.
If a container's indexing policy is set to None, indexing is
effectively disabled on that container. This is commonly used when a
container is used as a pure key-value store without the need for
secondary indexes. It can also help speed up bulk insert
operations.
Of course, you could exclude the root path and selectively include only the paths that need to be indexed, if you have special needs. BTW, as mentioned by #DraganB in the comments, changing the index policy only affects new records; you can see the statement in this link. So it's better to decide this at the initial time.
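
For illustration, here is a rough sketch using the @azure/cosmos JavaScript SDK of a container that excludes everything from indexing except the field used for sorting; the account, container, and property names ("eventstore", "events", "/position") are placeholders, not taken from the question:

const { CosmosClient } = require("@azure/cosmos");

async function createEventsContainer() {
  const client = new CosmosClient({
    endpoint: "https://<your-account>.documents.azure.com",
    key: "<your-key>",
  });
  const { database } = await client.databases.createIfNotExists({ id: "eventstore" });

  await database.containers.createIfNotExists({
    id: "events",
    partitionKey: { paths: ["/streamId"] },
    indexingPolicy: {
      indexingMode: "consistent",
      includedPaths: [{ path: "/position/?" }],  // ORDER BY on the sort field relies on this path staying indexed
      excludedPaths: [{ path: "/*" }],           // everything else is excluded
    },
  });
}

If you set indexingMode to "none" instead, included/excluded paths are not specified at all and the container behaves as a pure key-value store, but then sorting can no longer be served by an index, which is why the sort field is kept in includedPaths here.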

Related

Wildcard searches

Our MarkLogic-based web application mostly uses cts.jsonPropertyValueQuery to access the needed information.
We want to provide the possibility of wildcard searches against specific JSON properties.
What is the best way to do it?
Turning on one of the wildcard indexes for the whole database is not an option.
I figured out that adding a "wildcarded" parameter to the query itself may solve the problem:
cts.search(cts.jsonPropertyValueQuery("inventor", "R?th", ["wildcarded", "whitespace-sensitive"]));
But it may be slow due to the absence of indexes. Is there any way to create wildcard indexes only for that specific JSON property?
You could create a Path Field with an XPath to the inventor JSON field (and even for //inventor) and configure the field to have wildcard indexes, and then use a field query: cts.fieldValueQuery or cts.fieldWordQuery.
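
As a rough sketch of the query side, assuming a path field named "inventor-field" has been configured over the inventor property (e.g. XPath /inventor or //inventor) with its wildcard indexes enabled (the field name here is made up):

// MarkLogic server-side JavaScript; "inventor-field" is a hypothetical field
// configured in the Admin UI / Management API with wildcard indexes enabled.
const results = cts.search(
  cts.fieldValueQuery(
    "inventor-field",
    "R?th",
    ["wildcarded", "whitespace-sensitive"]
  )
);
results;

Because the wildcard indexes are scoped to the field rather than the whole database, the rest of your queries are unaffected.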

How to introduce a new column in dynamo DB running in production?

I have a use case where DynamoDB is running in production, and I need to add a new column IDUpdatedAt which will also serve as the sort key for one of the GSIs.
I tried this out in a test environment: my application adds new rows with IDUpdatedAt and it works fine, but what about the existing rows? How do I add the values for those?
Also, new rows will not be added without IDUpdatedAt, but how will searches be impacted for the older rows?
PS: IDUpdatedAt is being used as a filter in the application, i.e., user can search for specific ID and can get results sorted by date. That's why IDUpdatedAt is also a part of GSI (sort key).
Please help.
You've got the right idea by adding the field to new items. After all, DynamoDB does not enforce a particular schema outside of the primary key.
This also happens to be a very useful feature, especially when defining a GSI on that attribute: if the attribute exists on the item, it ends up in the index! For example, imagine modeling an email inbox in DDB where each item represents an email. You could include an attribute 'is_read' and define a GSI using that attribute.
If the 'is_read' attribute exists on the item, it's in the index. Otherwise, it's not. A cool way to use GSIs to implement filtering.
Pretty neat stuff!
However, there is no way to retroactively update all items with a new attribute other than manually updating each item (or in batches). The equivalent in SQL databases is defining a new column. Unfortunately, an analogous operation in DDB does not exist.
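
If the older items do need the attribute, the usual approach is a one-off backfill script that scans for items missing IDUpdatedAt and updates them in place. A rough sketch with the AWS SDK for JavaScript v3; the table name, key schema, and the way IDUpdatedAt is derived are all assumptions:

const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, ScanCommand, UpdateCommand } = require("@aws-sdk/lib-dynamodb");

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function backfillIDUpdatedAt() {
  let lastKey;
  do {
    // Only pull items that still lack the new attribute.
    const page = await doc.send(new ScanCommand({
      TableName: "MyTable",                                  // hypothetical table name
      FilterExpression: "attribute_not_exists(IDUpdatedAt)",
      ExclusiveStartKey: lastKey,
    }));

    for (const item of page.Items ?? []) {
      await doc.send(new UpdateCommand({
        TableName: "MyTable",
        Key: { pk: item.pk, sk: item.sk },                   // replace with your real key attribute names
        UpdateExpression: "SET IDUpdatedAt = :v",
        // Assumes the composite value can be derived from attributes already on the item.
        ExpressionAttributeValues: { ":v": `${item.ID}#${item.UpdatedAt}` },
      }));
    }
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
}

Until an item has the attribute, it simply stays out of the GSI, so older rows won't show up in queries against that index until they are backfilled.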

Can I update a value of a key in a hash table?

I am new to using hash tables, and I want to know how to change the value of a specific existing key in a hash table. I tried to search, but all that came up involved hash maps, which I am not familiar with and am not planning to use.
I am not sure whether hash tables only enable inserting and removing values, or whether they allow changing the value of an existing key.
Also, please explain to me how to do so (i.e. .put() means insert; what do I call to change a value?).
thanks.
Edited because, on reflection, the wording of the question seemed ambiguous, and might have assumed the wrong meaning initially.
You can't change the key, if that's what you meant. The key determines the position of an entry in the hash map/table (by definition), so if you change the key without changing the position, the map/table is now corrupt.
To change the key and change its position is simple: remove the entry under the old key, and add the same entry under the new key.
You can change the value associated with the key. There are several possible approaches. One is to just use put() with the same key to update the value; see the documentation for this. Another is to use entrySet() to get the set of key,value mappings, find the entry for your key, and use setValue() on that entry.
Of course, remove and re-add will also allow you to change the value.
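
For example, here is a minimal sketch using a JavaScript Map (which is hash-based); the same idea applies to put() on a Java HashMap/Hashtable, where writing to an existing key overwrites the old value:

const table = new Map();

table.set("alice", 1);              // insert
table.set("alice", 2);              // same key again: the value is replaced, not duplicated
console.log(table.get("alice"));    // 2

// "Changing" the key itself means remove + re-add under the new key.
const value = table.get("alice");
table.delete("alice");
table.set("alicia", value);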

Using timestamp as an Attribute in DynamoDB

I'm quite new to DynamoDB, but have some experience in Cassandra. I'm trying to adapt a pattern I followed in Cassandra, where each column represented a timestamped event, and wondering if it will carry over gracefully into DynamoDB or if I need to change my approach.
My goal is to query a set of documents within a date range by using the milliseconds-since-epoch timestamp as an Attribute name. I'm successfully storing the following as each report is generated, with each new report added under its own column:
{
  PartitionKey: customerId,
  SortKey: reportName_yyyymm,
  '#millis_1#': { 'report': doc_1 },
  '#millis_2#': { 'report': doc_2 },
  ...
  '#millis_n#': { 'report': doc_n }
}
My question is, given a millisecond-based date range, and the accompanying Partition and Sort keys, is it possible to query the set of Attributes that fall within that range or must I retrieve all columns for the matching keys and filter them at the client?
Welcome to the most powerful NoSQL database ;)
To kick off with the bad news: there is no way to query out specific attributes. You can project certain attributes in a query, but you would have to write your own logic to determine which attributes or columns should be included in the projected query. To get close to your solution you could use a map attribute inside an item, with the milliseconds as keys. But there is another thing you have to be aware of when starting down this path.
There is a maximum total item size of 400 KB for each item in DynamoDB, including key and attribute names (Limits in DynamoDB Items). This means you can only store so many attributes in an item, especially if you intend to put the actual report inside the attribute, which I would advise against; you will also be burning up read capacity units every time you get one attribute out of the whole item. You would be better off putting that data in a separate table, with the keys in the map.
But truthfully, in DynamoDB I would split this whole thing up: just add the milliseconds to the sort key and make every document its own item. That way you can query those items directly, and you can use the 'between' operator on the sort key to select specific date-time ranges. Please let me know if you meant something else.
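
To make the "split it up" suggestion concrete, here is a rough sketch with the AWS SDK for JavaScript v3; the table name, attribute names, and the reportName#millis sort-key format are assumptions, and the milliseconds are zero-padded so that string comparison matches numeric order:

const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, QueryCommand } = require("@aws-sdk/lib-dynamodb");

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Each report is its own item: partition key = customerId, sort key = "reportName#<millis>".
async function reportsInRange(customerId, reportName, fromMillis, toMillis) {
  const pad = (ms) => String(ms).padStart(15, "0"); // fixed width so string order matches numeric order
  const { Items } = await doc.send(new QueryCommand({
    TableName: "Reports",                           // hypothetical table name
    KeyConditionExpression: "#pk = :c AND #sk BETWEEN :from AND :to",
    ExpressionAttributeNames: { "#pk": "customerId", "#sk": "sk" },
    ExpressionAttributeValues: {
      ":c": customerId,
      ":from": `${reportName}#${pad(fromMillis)}`,
      ":to": `${reportName}#${pad(toMillis)}`,
    },
  }));
  return Items;
}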

DynamoDB Set order

From DynamoDB docs:
An attribute of type String Set. For example:
"SS": ["Giraffe", "Hippo" ,"Zebra"]
Type: Array of strings
Required: No
This is all I could find. I did some testing but that's clearly not enough for production environments and I would like to get a confirmation/confutation from people who have actually worked with these Sets.
Do DynamoDB Sets maintain insertion order? Can I count on that fact & build logic around that?
I'm mainly interested in String Sets, but it probably applies to all of them (String, Number, Binary).
Here is the documentation. The SET data type doesn't preserve order.
SET : The order of the values within a set are not preserved;
therefore, your applications must not rely on any particular order of
elements within the set.
LIST - A list type attribute can store an ordered collection of values
Similar discussion on AWS forum
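
As a quick illustration (table and attribute names are made up), the v3 document client stores a JavaScript Set as a DynamoDB set type and a plain array as a List:

const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, PutCommand } = require("@aws-sdk/lib-dynamodb");

const doc = DynamoDBDocumentClient.from(new DynamoDBClient({}));

async function putAnimals() {
  await doc.send(new PutCommand({
    TableName: "Animals",                                    // hypothetical table name
    Item: {
      pk: "zoo-1",
      animalSet: new Set(["Giraffe", "Hippo", "Zebra"]),     // stored as SS: order is NOT preserved
      animalList: ["Giraffe", "Hippo", "Zebra"],             // stored as L: insertion order is preserved
    },
  }));
}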
