Why link database tables? - ASP.NET

I have a DB model that uses several tables to hold business logic data; one of them is a user table that holds the UserID. The code reads data from these tables and generates a JSON string that's used in the application. To avoid redoing the queries and regenerating the JSON string, I store it in a table called JsonCache that has only 2 columns: UserID and JsonCacheWork. Each user has multiple JsonCacheWork entries (entries are added as the result of user interactions).
For the moment, there are no relationships on my database diagram between the User table and the JsonCache table. It seems to work fine.
Why would I need or want to add relationships between the 2 tables, and what are the advantages, if any, of keeping my DB diagram as is?
Thanks for the explanation.

You add the relationship between tables to tell the database management system (DBMS) that the records in those two tables are related. The DBMS can then enforce the relationship, which means you can't mistakenly add a record to the JsonCache table with a UserID that doesn't exist in the User table.
Also, the foreign key constraint can be used to automatically delete records in one table when a related record in the other table is deleted (cascade deletes). So when you delete a user, the DBMS will delete that user's entries in JsonCache for you.
SQL Server is a relational database, and relations between tables are a key part of the database schema. They help maintain data integrity.
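Both behaviours can be tried out in a small sketch. This one uses Python's built-in sqlite3 module rather than SQL Server, purely so it runs anywhere; the table is named Users here as an assumption, and the exact constraint syntax differs slightly between engines.

```python
import sqlite3

# In-memory database; SQLite only enforces foreign keys when this pragma is on.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema modelled on the question: Users and JsonCache.
conn.execute("CREATE TABLE Users (UserID INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE JsonCache (
        UserID INTEGER NOT NULL
            REFERENCES Users(UserID) ON DELETE CASCADE,
        JsonCacheWork TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO Users (UserID) VALUES (1)")
conn.execute("INSERT INTO JsonCache VALUES (1, '{\"a\": 1}')")
conn.execute("INSERT INTO JsonCache VALUES (1, '{\"b\": 2}')")

# A cache row for a user that does not exist is rejected by the constraint.
try:
    conn.execute("INSERT INTO JsonCache VALUES (99, '{}')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the user removes that user's cache rows as well (cascade delete).
conn.execute("DELETE FROM Users WHERE UserID = 1")
remaining = conn.execute("SELECT COUNT(*) FROM JsonCache").fetchone()[0]

print(orphan_rejected, remaining)
```

Without the relationship, the insert for user 99 would succeed and the two cache rows would be left behind as orphans after the delete.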

You would have to have a join table between the two if there was a many-to-many relationship between the two entities. I would say that since user and the JSON are one-to-one, I'd question using two separate entities. I'd put the JSON in the user table and simply query for it by user ID. I don't see a reason for the separate table with two columns just for the JSON.
I might change my mind if the JSON had a shelf life: effective and expiration dates, history for a single user, etc.


Can I create multiple DynamoDB tables with secondary indexes concurrently?

I am confused by the API documentation of CreateTable from DynamoDB. I need to create multiple tables with a secondary index. From the API: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/dynamodb/DynamoDbClient.html#createTable-software.amazon.awssdk.services.dynamodb.model.CreateTableRequest-
If you want to create multiple tables with secondary indexes on them, you must create the tables sequentially. Only one table with secondary indexes can be in the CREATING state at any given time.
and
Up to 500 simultaneous table operations are allowed per account. These operations include CreateTable, UpdateTable, DeleteTable, UpdateTimeToLive, RestoreTableFromBackup, and RestoreTableToPointInTime.
The only exception is when you are creating a table with one or more secondary indexes. You can have up to 250 such requests running at a time;
So can I create only one table with a secondary index at a time, or 250 at the same time?
If I create multiple tables sequentially without waiting for the ACTIVE state, does that already count as concurrent creation?
Must I wait for the ACTIVE state of every table if I create multiple tables with secondary indexes?
An individual account can only be running one "Create Index" action at a time, no matter how many tables you have.
To understand this it may help to understand what an index is. An index is a complete copy of the table, but with a different partition key and sort key. So if your original table has a PK of userId and an SK of sort_key, you could create an index where the partition key is sort_key and the sort key is userId, creating an inverted index. This is a common practice in Dynamo. Remember that queries in Dynamo must know the PK: if you have a userId you can access all data of that user. If you wanted all users who have a particular tag, you might store an SK item on each user that is something like TAG#ThisTag, and then query the inverted index with pk = TAG#ThisTag to get back a list of userIds.
While the CreateIndex is being run on a given table, no other actions can be run on it: it won't accept changes to the data or configuration that would cause a fault or mismatch in the copying process. This is one of the reasons a given account is limited to only one create index operation at a time.
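One conservative way to stay within the documented limit is to create the tables one after another and wait for each to become ACTIVE before starting the next. The sketch below assumes `client` behaves like a boto3 DynamoDB client (`boto3.client("dynamodb")`); the function name and the shape of `table_specs` are illustrative, not part of any AWS API.

```python
from typing import Any, Dict, Iterable, List


def create_tables_sequentially(
    client: Any, table_specs: Iterable[Dict[str, Any]]
) -> List[str]:
    """Create each table and block until it is ACTIVE before starting the next.

    `client` is assumed to behave like a boto3 DynamoDB client. Each spec
    holds the keyword arguments for one create_table call, including any
    secondary index definitions.
    """
    created = []
    waiter = client.get_waiter("table_exists")
    for spec in table_specs:
        client.create_table(**spec)
        # The table_exists waiter polls DescribeTable until the table is ACTIVE.
        waiter.wait(TableName=spec["TableName"])
        created.append(spec["TableName"])
    return created
```

Firing the create_table calls back to back without the wait would leave several tables in the CREATING state at once, which is exactly what the quoted documentation rules out for tables with secondary indexes.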
As a slight aside, if I may: if you have a single account with multiple Dynamo tables all for the same product, you may want to rethink your database strategy. A single Dynamo table can hold many different kinds of data if you set up your PK and SK as generic fields (i.e. pk and sk as the attribute names). No document inside your table has to have the same attributes as any other. And when accessing data, each partition key is exactly what it's named: a partition of data, and that partition is all that is accessed when a query is made against that PK. So if you have 100 items with a PK of USER#1 and 100 items with a PK of USER#2 and you query against USER#1, you only access those 100 items. The rest are ignored by the query and never touched, which in effect lets you have multiple "tables" in a single DynamoDB table by giving them different partition key prefixes.

How to filter DynamoDB by object property value

I have a DynamoDB table:
How should I filter entries in the table to get all items where access.role = "ADMIN"?
You would be best served by setting up a Global Secondary Index (GSI). You set the partition key equal to that attribute, and the sort key equal to some other attribute that you can guarantee will be unique. Index key attributes must be top-level scalar attributes, so a nested value such as access.role would first have to be copied into a top-level attribute. Then you use your SDK of choice or the Query option in the console, select the index, and query for partition_key = ADMIN.
However, be aware: indexes are a complete replication of the table. Dynamo is very good at this and relatively fast at doing so, but GSIs are eventually consistent, so there is still the possibility that your index will briefly be out of sync with the actual data. If you are not making the call against the index very often you are pretty much fine. If you are calling it very often, then you should restructure your table.
Dynamo is not SQL. When setting up a Dynamo schema you have to consider how you will access your data: your access patterns. You should design your data with your partition key as the data you will have when looking up (i.e. I will always have a user ID number) and your sort keys as the individual documents related to that PK (i.e. a user has a document that is their profile data, a document that is their profile picture URL, a document that is a list of their friends' user numbers, and so on).
Then you use indexes for things like your question that you won't be doing very often.

How best to perform a query on primary partition key only, for a table which has both partition key and sort key?

OK, I have a table with a primary partition key (Employee ID) and a sort key (Project ID). Now I want a list of all projects an employee works on. I also want a list of all employees working on a project. The relationship is many-to-many. I have created a schema in AppSync (GraphQL), and AppSync created the required queries and mutations for the type (EmployeeProjects). The ListEmployeeProjects query takes a filter input with different attributes. My question is: when I do the two searches on Employee ID only or Project ID only, will it be a complete table scan? How efficient will that be? If it is a table scan, can I reduce the time complexity by creating indexes (GSI or LSI)? The end product will have a huge amount of data, so I cannot test the app with such data beforehand. My project works fine, but I am worried about the problems that might arise later on with a lot of data. Can someone please help?
You don't need to (and should not) perform a Scan for this.
To get all of the projects an employee is working on, you just need to perform a Query on the base table, specifying employee ID as the partition key.
To get all of the employees on a project, you should create a GSI on the table. The partition key should be project ID and sort key should be employee ID. Then perform a Query on the GSI, using partition key of project ID.
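As a rough sketch of the two access paths, the functions below build the keyword arguments for the low-level Query call of a boto3 DynamoDB client. The table name, index name, and attribute names (EmployeeId, ProjectId, both assumed to be strings) are hypothetical and would need to match your own schema.

```python
from typing import Any, Dict

TABLE_NAME = "EmployeeProjects"      # hypothetical table name
GSI_NAME = "ProjectEmployeesIndex"   # hypothetical GSI name


def projects_for_employee(employee_id: str) -> Dict[str, Any]:
    """Query arguments for the base table: partition key = employee ID."""
    return {
        "TableName": TABLE_NAME,
        "KeyConditionExpression": "EmployeeId = :id",
        "ExpressionAttributeValues": {":id": {"S": employee_id}},
    }


def employees_for_project(project_id: str) -> Dict[str, Any]:
    """Query arguments for the GSI: partition key = project ID."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": GSI_NAME,
        "KeyConditionExpression": "ProjectId = :id",
        "ExpressionAttributeValues": {":id": {"S": project_id}},
    }
```

With a real client these would be passed as `client.query(**projects_for_employee("E1"))`. Both calls are key lookups, so neither reads items outside the requested partition, unlike a Scan with a filter.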
In order to model this correctly you will probably want three tables:
Employee Table
Project Table
Employee-Project reference table (i.e. just two attributes of employee ID and project ID)

Can Sails.js attributes link to a collection via multiple columns?

I'm using Sails.js to build an API for an existing database. Unfortunately, modifying the structure of the database is not an option.
Many tables in the database have status columns of one type or another. They tend to have single-letter values that don't make sense without context. Context is provided by a "lookup" table in the database with 3 primary keys: table_name, column_name, and column_contents. Therefore, if I have a letter returned as a status, I can do a query against the lookup table and check a fourth column, description.
I'd love to configure my Sails.js models to understand all this, but it seems that one-to-many relationships can only be set up for tables with a single primary key. Is that correct?
Based on the "many-to-many" workaround, I assume the sails way to solve this would be to create new tables that are subsets of the "lookup" table (each for a single instance of table_name, column_name). Is there a better way?

Insert data from an ASP.NET form into a SQL database with foreign key constraints

I have two tables:
asset           employee
assetid (PK)    empid (PK)
empid (FK)
Now, I have a form to populate the asset table, but the insert fails because of the foreign key constraint.
What should I do?
Thanks
Tk
Foreign keys are created for a good reason - to prevent orphan rows at a minimum. Create the corresponding parent and then use the appropriate value as the foreign key value on the child table.
You should think about this update as a series of SQL statements, not just one statement. You'll process the statements in order of dependency, as in the example that follows.
Asset
    PK AssetID
    AssetName
    FK EmployeeID
    etc...
Employee
    PK EmployeeID
    EmployeeName
    etc...
If you want to "add" a new asset, you'll first need to know which employee it will be assigned to. If it will be assigned to a new employee, you'll need to add them first.
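Here is an example of adding an asset named 'BOOK' for a new employee named 'Zach'.
DECLARE @EmployeeFK AS INT;
-- Add the parent row first.
INSERT INTO Employee (EmployeeName) VALUES ('Zach');
-- Capture the identity just assigned (SCOPE_IDENTITY() is safer if triggers also insert rows).
SELECT @EmployeeFK = @@IDENTITY;
-- Add the child row using the parent's key.
INSERT INTO Asset (AssetName, EmployeeID) VALUES ('BOOK', @EmployeeFK);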
The important thing to notice above is that we grab the new identity (i.e. the EmployeeID) assigned to 'Zach', so we can use it when we add the new asset.
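The same parent-then-child ordering can be sketched from application code. This version uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs as written; `lastrowid` plays the role of the identity value here, and the schema is an assumption based on the listing above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.execute(
    "CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, EmployeeName TEXT)"
)
conn.execute(
    """
    CREATE TABLE Asset (
        AssetID INTEGER PRIMARY KEY,
        AssetName TEXT,
        EmployeeID INTEGER NOT NULL REFERENCES Employee(EmployeeID)
    )
    """
)

with conn:  # commit both inserts together, or roll both back on error
    cur = conn.execute(
        "INSERT INTO Employee (EmployeeName) VALUES (?)", ("Zach",)
    )
    employee_id = cur.lastrowid  # key assigned to the new parent row
    conn.execute(
        "INSERT INTO Asset (AssetName, EmployeeID) VALUES (?, ?)",
        ("BOOK", employee_id),
    )

row = conn.execute(
    "SELECT e.EmployeeName, a.AssetName "
    "FROM Asset a JOIN Employee e ON e.EmployeeID = a.EmployeeID"
).fetchone()
print(row)
```

Wrapping both inserts in one transaction means a failure on the asset insert does not leave a stray employee row behind.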
If I understand you correctly, are you trying to build the data graph locally before persisting to the database? That is, create the parent and child records within the application and persist it all at once?
There are a couple of approaches to this. One approach people take is to use GUIDs as the unique identifiers for the data. That way you don't need to get the next ID from the database; you can just create the graph locally and persist the whole thing. There's been a long-running debate on this approach between software developers and database administrators, because while it makes a lot of sense in many ways (hit the database less often, maintain relationships before persisting, uniquely identify data across systems), it turns out to be a significant resource hit on the database: the keys are wider than integers, and random values tend to fragment clustered indexes.
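A minimal sketch of the GUID approach, again using sqlite3 as a stand-in database so it runs as written. The dictionary layout and the second asset ('LAPTOP') are made up for illustration.

```python
import sqlite3
import uuid

# Build the parent/child graph in memory. No database round trip is needed
# to learn the keys because they are generated locally.
employee = {"id": str(uuid.uuid4()), "name": "Zach"}
assets = [
    {"id": str(uuid.uuid4()), "name": "BOOK", "employee_id": employee["id"]},
    {"id": str(uuid.uuid4()), "name": "LAPTOP", "employee_id": employee["id"]},
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE Employee (EmployeeID TEXT PRIMARY KEY, EmployeeName TEXT)"
)
conn.execute(
    "CREATE TABLE Asset (AssetID TEXT PRIMARY KEY, AssetName TEXT, "
    "EmployeeID TEXT NOT NULL REFERENCES Employee(EmployeeID))"
)

# Persist the whole graph in one transaction, parent before children.
with conn:
    conn.execute(
        "INSERT INTO Employee VALUES (?, ?)", (employee["id"], employee["name"])
    )
    conn.executemany(
        "INSERT INTO Asset VALUES (:id, :name, :employee_id)", assets
    )

asset_count = conn.execute(
    "SELECT COUNT(*) FROM Asset WHERE EmployeeID = ?", (employee["id"],)
).fetchone()[0]
print(asset_count)
```

The parent still has to be inserted before its children for the foreign key to hold, but the application no longer has to read a generated ID back between the two steps.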
Another approach is to use an ORM that will handle the persistence mapping for you. Something like NHibernate, for example. You would create your parent object and the child objects would just be properties on that. They wouldn't have any concept of foreign keys and IDs and such, they'd just be objects in code related by being set as properties on each other (such as a "blog post" object with a generic collection of "comment" objects, etc.). This graph would be handed off to the ORM which would use its knowledge of the mapping between the objects and the persistence to send it off to the database in the correct order, perhaps giving back the same object but with ID numbers populated.
Or is this not what you're asking? It's a little unclear, to be honest.
