Does RocksDB support basic key concepts?

RocksDB:
Hi, I am working on a RocksDB use case. Below are some questions I am trying to understand; any help would be appreciated:
How are primary and partition keys defined in RocksDB?
Does RocksDB support indexing?

There is no primary key or partition key. RocksDB is an on-disk sorted map where keys and values are byte arrays; that's all. Any richer semantics have to be implemented at the application layer.
No, there is no secondary indexing; you have to build it yourself, using atomic WriteBatch updates to keep index entries consistent with the data (see the sketch below).
And yes, it is free to use; no commercial license is required.
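For concreteness, here is a minimal sketch of both points, using the unofficial python-rocksdb binding; the key layout and the email index are illustrative assumptions, not anything RocksDB itself provides:

```python
import rocksdb

# Open the database; keys and values are plain byte strings.
db = rocksdb.DB("blog.db", rocksdb.Options(create_if_missing=True))

# There is no schema: the "primary key" is whatever byte string you pick.
db.put(b"user#42", b'{"name": "alice", "email": "a@example.com"}')

# An application-level secondary index (email -> user key), kept in sync
# with the data by writing both entries in one atomic WriteBatch.
batch = rocksdb.WriteBatch()
batch.put(b"user#43", b'{"name": "bob", "email": "b@example.com"}')
batch.put(b"idx:email:b@example.com", b"user#43")
db.write(batch)

# A lookup "by email" is two gets; there is no native index or join.
user_key = db.get(b"idx:email:b@example.com")
print(db.get(user_key))
```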

Related

Maintain unique value for DynamoDB partition key

I'm new to "DynamoDB" and wanting to know best practice to maintaining unique partition key value when you add records to a table.
With my existing experience related to SQL, primary keys are normally maintained by the system with identity columns or via a trigger. I've searched through various forums and "AWS" documentation, but did not find any specifics. Do you manually determine the existence of partition key value or am I missing something obvious?
In DynamoDB, querying flexibility is limited compared to SQL. So the schema, as well as the partition key / sort key, should be designed to make the most common and important queries as fast as possible. You can find some generic best practices here:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/best-practices.html
https://aws.amazon.com/blogs/database/choosing-the-right-dynamodb-partition-key/
If you can provide better context on the use case you are trying to use DynamoDB for, you should get more pointed answers.
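For the uniqueness concern specifically, the usual DynamoDB pattern (not spelled out above, so treat this as a hedged sketch with assumed table and attribute names) is a conditional write that fails if the partition key already exists:

```python
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("Users")  # hypothetical table

try:
    # attribute_not_exists makes the put fail if an item with this
    # partition key is already present, instead of silently replacing it.
    table.put_item(
        Item={"username": "alice", "email": "a@example.com"},
        ConditionExpression="attribute_not_exists(username)",
    )
except ClientError as e:
    if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
        print("username already taken")
    else:
        raise
```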

How well does Teradata deal with Foreign Keys?

I'm starting a new project and one of the requirements is to use Teradata. I'm proficient in many different database systems but Teradata is fairly new to me.
On the client end they have removed all foreign keys from their database under the recommendations of "a consultant".
Every part of me cringes.
I'm using a new database instance so I'm not constrained by what they've already done on other databases. I haven't been explicitly told not to use foreign keys and my relation with the customer is such that they will at the very least hear me out. However, my decision and case should be well-informed.
Is there any intrinsic, technological reason that I should not use FKs in Teradata to maintain referential integrity, based upon Teradata's design, performance, side-effects, etc.?
Of note, I'm accessing Teradata using the .Net Data Provider v16 which only supports up to EF5.
Assuming that the new project is implementing a Data Warehouse, there's a simple reason (and this is true for any DWH, not only Teradata): a DWH is not the same as an OLTP system.
Of course you still have Primary & Foreign Keys in the logical data model, but they may not be implemented in the physical model (although they are supported by Teradata). There are several reasons:
Data is usually loaded in batches into a DWH, and both PKs & FKs must be validated by the loading process before Insert/Update/Delete. Otherwise you load 1,000,000 rows, a single row fails the constraints, and now you have a rollback, an error message, and a hunt for the bad data; good luck. When all the validation is already done during load, there's no reason to do the same checks a second time within the database.
Some tables in the DWH will be Slowly Changing Dimensions, and there's no way to define a PK/FK on those using standard SQL syntax; you need something like TableA.column references TableB.column and TableA.Timestamp between TableB.ValidFrom and TableB.ValidTo (it is possible when you create a Temporal Table).
Sometimes a table is recreated or reloaded from scratch, which is hard to do if there's a FK referencing it.
Some PKs are never used for any access/join, so why implement them physically? It's just a huge overhead in CPU/IO/storage.
Knowledge about PKs/FKs is important for the optimizer, so there's a so-called Soft Foreign Key (REFERENCES WITH NO CHECK OPTION), which is a kind of dummy: it is applied during optimization but never actually checked by the DBMS (it's like telling the optimizer "trust me, it's correct").
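As a concrete illustration of that soft FK, here is a hedged sketch; the table, column, and connection details are hypothetical, and the DDL is issued through Teradata's teradatasql Python driver:

```python
import teradatasql  # Teradata's Python DB-API driver

# WITH NO CHECK OPTION declares a "soft" foreign key: the optimizer may
# use it (e.g., for join elimination), but the DBMS never enforces it.
# Table and column names here are made up for illustration.
ddl = """
ALTER TABLE sales ADD CONSTRAINT fk_sales_customer
    FOREIGN KEY (customer_id)
    REFERENCES WITH NO CHECK OPTION customers (customer_id)
"""

with teradatasql.connect(host="tdhost", user="dbc", password="dbc") as con:
    with con.cursor() as cur:
        cur.execute(ddl)
```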

Is there a way to enforce a schema constraint on an AWS DynamoDB table?

I'm an MSSQL developer who was recently tasked with building a new application using DynamoDB, since we use AWS and we wanted a highly scalable database service.
My biggest concern is data integrity. For example, I have a table for all my users where every row needs to have a username, email, and name field, all strings, plus a verified field that's an int. Is there any way to require all entries in that table to have those fields and to be of those particular types?
Since the application is in PHP, I'm using Kettle as my ORM, which should prevent me from messing up the data integrity, but another developer voiced a concern about what happens if we ever add another application or if someone manually changes some types via the console.
https://github.com/inouet/kettle
Currently, no: you are responsible for maintaining the integrity of your items with respect to the existence of non-key attributes on the base table. However, you can use LSIs and GSIs to enforce the data types of attributes (notwithstanding my qualm that this is not a recommended pattern, as it could cause partition heat, especially for attributes whose range of values is small). For example, verified seems like it might take only 0 or 1 as a value, so if you create a GSI with PK=verified, where verified is a Number, writes to the base table may get throttled by the verified GSI.
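To make that mechanism concrete, here is a hedged boto3 sketch (table and index names are assumptions): because the GSI declares verified as a Number, DynamoDB rejects any write where verified arrives with a different type:

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")

# Hypothetical table: username is the key; verified gets a declared
# type only because it is the hash key of a GSI.
dynamodb.create_table(
    TableName="Users",
    AttributeDefinitions=[
        {"AttributeName": "username", "AttributeType": "S"},
        {"AttributeName": "verified", "AttributeType": "N"},
    ],
    KeySchema=[{"AttributeName": "username", "KeyType": "HASH"}],
    GlobalSecondaryIndexes=[{
        "IndexName": "verified-index",
        "KeySchema": [{"AttributeName": "verified", "KeyType": "HASH"}],
        "Projection": {"ProjectionType": "KEYS_ONLY"},
    }],
    BillingMode="PAY_PER_REQUEST",
)
dynamodb.get_waiter("table_exists").wait(TableName="Users")

try:
    # verified sent as a String: rejected with a ValidationException
    # because the index schema declares it as a Number.
    dynamodb.put_item(
        TableName="Users",
        Item={"username": {"S": "alice"}, "verified": {"S": "yes"}},
    )
except ClientError as e:
    print(e.response["Error"]["Code"])  # ValidationException
```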

What kind of data storage uses Sparkfun's phant?

What kind of data storage does Sparkfun's phant use? Is there a key-value or relational data store in the background, or a document-based one?
Okay, I guess I found an answer to this on my own now. According to www.prokarma.com's blog, they use
a repository for storing key value pairs being pushed from devices
connected to the internet.

DynamoDB secondary sort

I'm assessing whether I can use DynamoDB for our next project. What we are building is quite similar to a blogging platform; here is a simple table:
Blog Post
ID - primary hash key
Title
DateCreated - primary range key
Votes
I've read enough to know how to do List (a list of blog posts), Paging (using the last fetched index), and Get post details (get a row). I will be sorting using DateCreated, which is my range key.
I'm struggling with how to sort on a secondary index. For example, if we have a column called Votes, how do you do Most Votes? My interpretation is that you can only sort using the range index, which I'm already using.
Update
AWS has just announced general availability of the much anticipated Global Secondary Indexes for Amazon DynamoDB, which address the limitations of the Local Secondary Indexes discussed further below:
You can now create indexes and perform lookups using attributes other than the item's primary key. [...]
You can now create up to five Global Secondary Indexes when you create a table, each referencing either a hash key or a hash key and a range key. You can also create up to five Local Secondary Indexes, and you can choose to project some or all of the table's attributes into each of the table’s indexes.
Please refer to the blog post for more details on the choice between these two models.
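To make the GSI option concrete for the Most Votes case, here is a hedged boto3 sketch; the VotesIndex GSI and its constant Bucket hash key are assumptions (a common trick, since a GSI query still needs an equality condition on its hash key):

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("BlogPost")

# Hypothetical GSI "VotesIndex": hash key Bucket (a constant such as
# "all"), range key Votes. ScanIndexForward=False returns items in
# descending Votes order, i.e. "Most Votes" first.
resp = table.query(
    IndexName="VotesIndex",
    KeyConditionExpression=Key("Bucket").eq("all"),
    ScanIndexForward=False,
    Limit=10,
)
for item in resp["Items"]:
    print(item["Title"], item["Votes"])
```

One caveat on the design: funnelling every post into a single Bucket value concentrates the index on one partition, so this pattern suits modest write volumes.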
Correction
As rightly pointed out by vartec, I got ahead of myself adding this information on the day Local Secondary Indexes were announced, without properly analyzing the problem at hand, where they are in fact not applicable; ironically, I stressed just that myself in a later comment on another question:
[...] however, please note that local is a crucial limitation: A local secondary index is a data structure that maintains an alternate range key for a given hash key - while this covers many real world scenarios, it doesn't apply to arbitrary non primary key field queries like those of the question at hand.
Thanks vartec for spotting this error and apologies for being misleading here.
Initial (erroneous) answer
Amazon DynamoDB has just announced Support for Local Secondary Indexes to address your use case:
[...] We call the newest capability Local
Secondary Indexes (LSI). While DynamoDB already allows you to perform
low-latency queries based on your table’s primary key, even at
tremendous scale, LSI will now give you the ability to perform fast
queries against other attributes (or columns) in your table. This
gives you the ability to perform richer queries while still meeting
the low-latency demands of responsive, scalable applications.
See also the introductory blog post Local Secondary Indexes for Amazon DynamoDB for a more detailed explanation.
As usual for AWS, the new functionality is released with a constrained feature set at first, which is going to be expanded over time:
Today, local secondary indexes must be defined at the time you create
your DynamoDB tables. In the future, we plan to provide you with an
ability to add or drop LSI for existing tables. If you want to equip
an existing DynamoDB table with local secondary indexes immediately, you
can export the data from your existing table using Elastic Map Reduce,
and import it to a new table with LSI. [emphasis mine]
Looks like this isn't possible; you can only sort by the range key.
I'm going to load the table into memory and sort it there.
