Is there any way to upsert in RIAK TS, like one can do in Mongodb?
Do I have to check if a key exists and if it does, manually overwrite it? Or can I assume that an insert will overwrite any existing key?
Although Riak TS is geared towards immutable data it is possible to update existing records. INSERT operations will overwrite existing keys. There is no upsert functionality.
Related
Is there some operation of the Scan API or the Query API that allows to perform a lookup on a table with a composite key (pk/sk) but that varies only in the pk to optimize the Scan operation of the table ?
Let me introduce a use case:
Suppose I have a partition key defined by the id of a project and within each project I have a huge amount of records (sk)
Now, I need to solve the query "return all projects". So I don't have a partition key and I have to perform a scan.
I know that I could create a GSI that solves this problem, but let's assume that this is not the case.
Is there any way to perform a scan that "hops" between each pk, ignoring the elements of the sk's?
In other words, I will collect the information of the first record of each partition key.
DynamoDB is a NoSQL database, as you already know. It is optimized for LOOKUP, and practices that you used to have in SQL databases or other (low-scale) databases are not always available in DynamoDB.
The concept of a partition key is to put records that are part of the same partition together and sorted by the sort key. The other side of it is that records that don't have the same partition key, are stored in other locations. It is not a long list (or tree) of records that you can scan over.
When you design your schema in a NoSQL database, you need to consider the access pattern to that data. If you need a list of all the projects, you need to maintain an index that will allow it.
New to DynamoDB, I have the partition group_id, and sort key groupid_storeid_sortk.
I am wanting to setup additional access pattern with the group_id and store_addrss_sortk.
Will this have any impact on performance using the partition key in the secondary index, or would it be better to create a new attribute as the secondary key, even though it would be duplicate data.
ThankYou
It’s fine to use the same partition key attribute again as the PK for the GSI. No problem there.
For the future: You may want to watch some videos on single-table design and start using PK/SK as generic names since you might want to overload what’s inside them for different items. And then you might want GSI1PK/GSI1SK as the GSI keys.
That’s a style thing when you aim for some optimizations single-table design can bring.
An index is simply another table that you don't have to manage yourself. When you create an index, the service (DynamoDB, for example) creates a new table for you and manages the synchronization of the data between the tables.
In DynamoDB you have two types of secondary indexes, Global and Local. If you use the same partition key, you can use both of these options. However, you have to define the secondary local index (SLI) when you create the table and you can't add it later. Only secondary global indexes (SGI) can be added after the creation of the table. You can read more about it in DyanmoDB documentation.
Regarding performance, you need to consider the cost (read/write capacity) on top of the usual time considerations. You need to see if you are writing a lot to the table and not only reading a lot. Based on that you can plan carefully the projection of the data into the new index. Remember that writes are about 10 times more expensive and slower than reads. You can read more about projection best practices here.
I wanted to prevent any document create/update in Cosmos db with empty or null PartitionKey. Is this possible from a db level?
AFAIK, it is not possible to enforce this constraint from the database level. You would need to implement this from the application side only.
If your collection is setup with a partition key, I don't see how you can update it using any SDK without specifying it because how would CosmosDB know in what partition to put it?
Of course if you create a null partition key that is a different story.
I have a collection with thousands of documents all of which have a synthetic partition key property like:
partitionKey: ‘some-document-related-value’
now i need to change values for partitionKey. of course, it takes recreation of documents in order to do so but i am wondering what is the most efficient/straightforward way to do it?
should i use azure function with cosmosdbtrigger? (set to start feed from begining)
change feed processor?
some other way?
i’m looking for quickest solution thats still reliable.
Yes, change feed is a common way to migrate data from one container to another. Another simple option may be to use Data Migration Tool where you build your new partition key in the select statement.
Hopefully this is helpful.
What's the recommended way of manually indexing a newly inserted record, and re-indexing a changed record, in OrientDB?
I know of the possibility of automatic indexes with OrientDB, but I need to map the indexed property to a lowercase version (for case insensitive search). So I will need to index records manually. Is it recommended/possible to make a database procedure to (re-)index a record?
In OrientDB you've both:
automatic indexes, updated by OrientDB (like RDBMS)
manual indexes, on your charge, it's like a persistent map
For more information look at the official guide: https://github.com/orientechnologies/orientdb/wiki/Indexes