Analyze a scenario performance? - asp.net

I want to design something like a dynamic form in which an admin defines each form's fields.
I designed 3 tables: a mainform table for shared properties, then a formfield table which has mainformID as a foreign key and defines each form's fields
e.g:
AutoID | FormID | FieldName
_____________________________
100 | Form1 | weight
101 | Form1 | height
102 | Form1 | color
103 | Form2 | Size
104 | Form2 | Type
....
and at last a formvalues table like below:
FormFieldID | Value | UniqueResponseID
___________________________________________
100 | 50px | 200
101 | 60px | 200
102 | Red | 200
100 | 30px | 201
101 | 20px | 201
102 | Black | 201
103 | 20x10 | 201
104 | Y | 201
....
For each form I have to join these 3 tables to fetch all fields and values. I wonder if it's the only way to design such a scenario? does it decrease sql performance? or is there any fast and better way?

This is a form of EAV, and I'm gonna assume you absolutely have to do it instead of the "static" design.
does it decrease sql performance?
Yes, getting a bunch of rows (under EAV) is always going to be slower than getting just one (under the static design).
or is there any fast and better way?
Not from the logical standpoint, but there are significant optimizations (for query performance at least) that can be done at the physical level. Specifically, you can carefully design your keys to minimize the I/O (by putting related data close together) and even eliminate the JOIN itself.
For example:
This model migrates keys through FOREIGN KEY hierarchy all the way down to the ATTRIBUTE_VALUE table. The resulting natural composite key in ATTRIBUTE_VALUE table enables us to:
Get all attributes [1] of a given form by a single index range scan + table heap access on the ATTRIBUTE_VALUE table, without doing any JOINs at all. In addition to that, you can cluster [2] it, eliminating the table heap access and leaving you with only the index range scan [3].
If you only need to get the data for a specific response, change the order of the fields in the composite key so that RESPONSE_ID is at the leading edge.
If you need both "by form" and "by response" queries, you'll need both indexes, at which point I'd recommend that the secondary index also cover [4] the VALUE field.
For example:
-- Since we haven't used NONCLUSTERED clause, this is a B-tree
-- that covers all fields. Table heap doesn't exist.
CREATE TABLE ATTRIBUTE_VALUE (
FORM_ID INT,
ATTRIBUTE_NAME VARCHAR(50),
RESPONSE_ID INT,
VALUE VARCHAR(50),
PRIMARY KEY (FORM_ID, ATTRIBUTE_NAME, RESPONSE_ID)
-- FOREIGN KEYs omitted for brevity.
);
-- We have included VALUE, so this B-tree covers all fields as well.
CREATE UNIQUE INDEX ATTRIBUTE_VALUE_IE1 ON
ATTRIBUTE_VALUE (RESPONSE_ID, FORM_ID, ATTRIBUTE_NAME)
INCLUDE (VALUE);
[1] Or a specific attribute, or a specific response for a specific attribute.
[2] MS SQL Server clusters a table on its primary key by default, unless you specify the NONCLUSTERED clause.
[3] Friendliness to clustering and elimination of JOINs are some of the main strengths of natural keys (as opposed to surrogate keys). But they also make tables "fatter" and don't isolate from ON UPDATE CASCADE. I believe pros outweigh cons in this particular case. For more info on natural vs. surrogate keys, look here.
[4] Fortunately, MS SQL Server supports including fields in an index solely for covering purposes (as opposed to actually searching through the index). This makes the index leaner than a "normal" index on the same fields.
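As a rough, runnable illustration of the composite-key layout above, here is a sketch that uses Python's sqlite3 module in place of MS SQL Server (an assumption made only so the example is self-contained). SQLite's WITHOUT ROWID tables stand in for a clustered primary key, and since SQLite has no INCLUDE clause the secondary index simply lists VALUE as a trailing key column. The sample rows and the two helper functions are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- WITHOUT ROWID keeps the rows inside the primary-key B-tree,
    -- SQLite's closest analogue of a clustered table.
    CREATE TABLE ATTRIBUTE_VALUE (
        FORM_ID        INTEGER NOT NULL,
        ATTRIBUTE_NAME TEXT    NOT NULL,
        RESPONSE_ID    INTEGER NOT NULL,
        VALUE          TEXT,
        PRIMARY KEY (FORM_ID, ATTRIBUTE_NAME, RESPONSE_ID)
    ) WITHOUT ROWID;

    -- "By response" access path. VALUE is appended as a key column
    -- so that this index also covers the query.
    CREATE INDEX ATTRIBUTE_VALUE_IE1
        ON ATTRIBUTE_VALUE (RESPONSE_ID, FORM_ID, ATTRIBUTE_NAME, VALUE);
""")

# Hypothetical sample data, modelled on the question's tables.
conn.executemany(
    "INSERT INTO ATTRIBUTE_VALUE VALUES (?, ?, ?, ?)",
    [
        (1, "weight", 200, "50px"), (1, "height", 200, "60px"), (1, "color", 200, "Red"),
        (1, "weight", 201, "30px"), (1, "height", 201, "20px"), (1, "color", 201, "Black"),
        (2, "Size", 201, "20x10"), (2, "Type", 201, "Y"),
    ],
)


def values_for_form(form_id):
    """All values of one form, read from a single table without any JOIN."""
    return conn.execute(
        "SELECT ATTRIBUTE_NAME, RESPONSE_ID, VALUE FROM ATTRIBUTE_VALUE "
        "WHERE FORM_ID = ? ORDER BY ATTRIBUTE_NAME, RESPONSE_ID",
        (form_id,),
    ).fetchall()


def values_for_response(response_id):
    """All values of one response, served by the secondary index."""
    return conn.execute(
        "SELECT FORM_ID, ATTRIBUTE_NAME, VALUE FROM ATTRIBUTE_VALUE "
        "WHERE RESPONSE_ID = ? ORDER BY FORM_ID, ATTRIBUTE_NAME",
        (response_id,),
    ).fetchall()
```

Both lookups filter on the leading column of one of the two B-trees, which is the property the key design is after; whether the planner really uses a pure range scan can be checked with EXPLAIN QUERY PLAN on your own data.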

I like Branko's approach, and it is quite similar to metadata models I have created in the past, so this post is by way of extension to his. You may want to add a datatype table, which can work both for native types (int, varchar, bit, datetime etc.) and your own definitions (although I don't see the necessity off the cuff).
thence, Branko's "value" column becomes:
value_tinyint tinyint
value_int int
value_varchar varchar(xx)
etc.
with a datatype_id (probably tinyint) as a foreign key into the "mydatatype" table.
[excuse the lack of pretty ER diagrams like BD's]
mydatatype
datatype_id tinyint
code varchar(16)
description varchar(64) -- for reference purposes
This extension should:
a. save you a good deal of casting when reading or writing your data
b. allow both reads and writes with some easily constructed dynamic SQL
Furthermore (and maybe this is out of scope), you may want to store the order in which these objects are created/saved, as well as conditional display based on button push/checkbox/radio button selection etc.
I won't go into detail here, since I'm not sure you need these things, but if you do I'll check this every so often and respond with stuff.
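The typed-column extension could look roughly like the sketch below. It uses Python's sqlite3 module only to stay self-contained, and everything beyond the column names mentioned above (the attribute_value table layout, the VALUE_COLUMN map, the write_value and read_value helpers) is a hypothetical illustration rather than part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mydatatype (
        datatype_id INTEGER PRIMARY KEY,
        code        TEXT NOT NULL,   -- e.g. 'int', 'varchar'
        description TEXT             -- for reference purposes
    );
    CREATE TABLE attribute_value (
        form_id        INTEGER NOT NULL,
        attribute_name TEXT    NOT NULL,
        response_id    INTEGER NOT NULL,
        datatype_id    INTEGER NOT NULL REFERENCES mydatatype (datatype_id),
        value_int      INTEGER,      -- one column per supported type
        value_varchar  TEXT,
        PRIMARY KEY (form_id, attribute_name, response_id)
    );
    INSERT INTO mydatatype VALUES (1, 'int', 'whole number'), (2, 'varchar', 'free text');
""")

# Maps a datatype code to the column that stores values of that type.
# Column names come only from this fixed map, never from user input,
# so splicing them into the SQL text below is safe.
VALUE_COLUMN = {"int": "value_int", "varchar": "value_varchar"}


def write_value(form_id, name, response_id, code, value):
    """Store a value in the column that matches its declared datatype."""
    column = VALUE_COLUMN[code]
    datatype_id = conn.execute(
        "SELECT datatype_id FROM mydatatype WHERE code = ?", (code,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO attribute_value "
        f"(form_id, attribute_name, response_id, datatype_id, {column}) "
        "VALUES (?, ?, ?, ?, ?)",
        (form_id, name, response_id, datatype_id, value),
    )


def read_value(form_id, name, response_id):
    """Return the value from whichever typed column its datatype points at."""
    code, as_int, as_text = conn.execute(
        "SELECT t.code, v.value_int, v.value_varchar "
        "FROM attribute_value v JOIN mydatatype t ON t.datatype_id = v.datatype_id "
        "WHERE v.form_id = ? AND v.attribute_name = ? AND v.response_id = ?",
        (form_id, name, response_id),
    ).fetchone()
    return as_int if code == "int" else as_text
```

With this layout a numeric attribute comes back as a number without any casting in the caller, which is point (a) above.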


DynamoDB Global Secondary Index "Batch" Retrieval

I've seen older posts around this but am hoping to bring this topic up again. I have a table in DynamoDB that has a UUID for the primary key and I created a global secondary index (GSI) for a more business-friendly key. For example:
| account_id | email | first_name | last_name |
|------------ |---------------- |----------- |---------- |
| 4f9cb231... | linda#gmail.com | Linda | James |
| a0302e59... | bruce#gmail.com | Bruce | Thomas |
| 3e0c1dde... | harry#gmail.com | Harry | Styles |
If account_id is my primary key and email is my GSI key, how do I query the table to get accounts with email in ('linda#gmail.com', 'harry#gmail.com')? I looked at the IN conditional expression but it doesn't appear to work with a GSI. I'm using the Go SDK v2 library but will take any guidance. Thanks.
Short answer, you can't.
DDB is designed to return a single item, via GetItem(), or a set of related items, via Query(). Related meaning that you're using a composite primary key (hash key & sort key) and the related items all have the same hash key (aka partition key).
Another way to think of it, you can't Query() a DDB Table/index. You can only Query() a specific partition in a table or index.
Scan() is the only operation that works across partitions in one shot. But scanning is very inefficient and costly since it reads the entire table every time.
You'll need to issue a separate request for every email you want returned. Against the table's primary key that is a GetItem(); against the GSI it has to be a Query() on the index, because GetItem() does not work on indexes.
Luckily, DDB now offers BatchGetItem(), which will allow you to send multiple, up to 100, GetItem() requests in a single call. It saves a little bit of network time and automatically runs the requests in parallel, but otherwise it is little different from what your application could do itself directly with GetItem(). Make no mistake, BatchGetItem() is making individual GetItem() requests behind the scenes. In fact, the requests in a BatchGetItem() don't even have to be against the same table. The cost for each request in a batch will be the same as if you'd used GetItem() directly. Like GetItem(), it addresses items by the table's primary key, so it cannot read through a GSI.
One difference to make note of: BatchGetItem() can only return 16 MB of data. So if your DDB items are large, you may not get as many returned as you requested.
For example, if you ask to retrieve 100 items, but each individual
item is 300 KB in size, the system returns 52 items (so as not to
exceed the 16 MB limit). It also returns an appropriate
UnprocessedKeys value so you can get the next page of results. If
desired, your application can include its own logic to assemble the
pages of results into one dataset.
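A minimal sketch of the one-request-per-key idea, written in Python rather than Go and using Query against the index, since GetItem only addresses the base table's primary key. The function takes the Query operation as a parameter (with boto3 that would be the client's query method), so it runs here against a small in-memory stand-in instead of AWS. The table name, index name and stand-in data are assumptions for illustration.

```python
def query_index_by_emails(query, table_name, index_name, emails):
    """Run one Query per email against the index and collect all items.

    `query` is any callable with the keyword interface of the low-level
    DynamoDB Query operation, e.g. boto3.client("dynamodb").query.
    """
    items = []
    for email in emails:
        request = {
            "TableName": table_name,
            "IndexName": index_name,
            "KeyConditionExpression": "email = :email",
            "ExpressionAttributeValues": {":email": {"S": email}},
        }
        while True:
            page = query(**request)
            items.extend(page.get("Items", []))
            # A Query result can be paginated; keep going until no
            # LastEvaluatedKey is returned for this email.
            last_key = page.get("LastEvaluatedKey")
            if not last_key:
                break
            request["ExclusiveStartKey"] = last_key
    return items


# In-memory stand-in for the index so the sketch runs without AWS.
FAKE_INDEX = {
    "linda#gmail.com": {"account_id": {"S": "4f9cb231..."}, "email": {"S": "linda#gmail.com"}},
    "bruce#gmail.com": {"account_id": {"S": "a0302e59..."}, "email": {"S": "bruce#gmail.com"}},
    "harry#gmail.com": {"account_id": {"S": "3e0c1dde..."}, "email": {"S": "harry#gmail.com"}},
}


def fake_query(**request):
    """Mimics the shape of a Query response for a single email key."""
    email = request["ExpressionAttributeValues"][":email"]["S"]
    item = FAKE_INDEX.get(email)
    return {"Items": [item] if item else []}


found = query_index_by_emails(
    fake_query, "accounts", "email-index", ["linda#gmail.com", "harry#gmail.com"]
)
```

The per-email requests are independent, so an application is free to issue them concurrently; the sequential loop is only the simplest form.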
Because you have a GSI with a PK of email (from what I understand), you can use a PartiQL statement to get your batch of emails back. The API is called ExecuteStatement and you use a SQL-like syntax:
SELECT * FROM mytable.myindex WHERE email IN ['email#email.com','email1#email.com']

How to create lines/stops relationship

I'm not a database expert and I'm simply building a prototype app, so nothing really important.
Anyway, the app is about a subway: this subway has many lines and sometimes some stops are shared between lines (so, for example, stops 3 and 4 are stops of lines 2, 7 and 9).
So, I made up a SQLite stops table:
+---------+-------------+------+
| Field | Type | Auto |
+---------+-------------+------+
| id | integer | YES |
| name | varchar(20) | NO |
| lines | ? | NO |
+---------+-------------+------+
What's the best way to deal with shared stops? My idea was to create a lines table and then in the lines field of the stops table put a comma separated list of lines.id. I don't know why, but I feel there could be a better way.
Any suggestion is appreciated, and sorry for the really noob question.
I would keep it simple and use a table lines which has an ID (primary key) along with other metadata for a line (such as name):
lines
(id, name)
Then, create a table for the stops:
stops
(id, name)
Finally, you can create a bridge table which will connect lines with stops:
bridge
(lineId, stopId)
Each record in the bridge table represents one line having a given stop.
Note that using CSV to represent a line having multiple stops is totally not the way to go here, as it renders the powers of your relational database useless.
Update:
If you want to record the position of a stop in a given line (and assuming that positions would differ across lines), you could use the following table:
stopNumbers
(lineId, stopId, stopPosition)
The stop position can be obtained knowing the line's ID and the stop's ID.
You need a many-to-many relation, which is stored in a separate table like this:
table lines_to_stops
line_fk
stop_fk
That's the relational world ...
Note that records in the database are not in any specific order. If you need to put the stops into any specific order (which you most probably do), you have to store this order to the database as well:
table lines_to_stops
line_fk
stop_fk
order_in_line
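To make the bridge-table idea concrete, here is a small sketch using Python's built-in sqlite3 module (the question mentions SQLite). The table and column names follow the answer above; the stop names, the sample data and the two helper functions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE lines (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE stops (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- One row per (line, stop) pair, plus the stop's position on that line.
    CREATE TABLE lines_to_stops (
        line_fk       INTEGER NOT NULL REFERENCES lines (id),
        stop_fk       INTEGER NOT NULL REFERENCES stops (id),
        order_in_line INTEGER NOT NULL,
        PRIMARY KEY (line_fk, stop_fk),
        UNIQUE (line_fk, order_in_line)
    );

    INSERT INTO lines VALUES (2, 'Line 2'), (7, 'Line 7');
    INSERT INTO stops VALUES (3, 'Stop 3'), (4, 'Stop 4'), (5, 'Stop 5');
    -- Stops 3 and 4 are shared by both lines.
    INSERT INTO lines_to_stops VALUES (2, 3, 1), (2, 4, 2);
    INSERT INTO lines_to_stops VALUES (7, 5, 1), (7, 3, 2), (7, 4, 3);
""")


def stops_on_line(line_id):
    """Stop names of one line, in travelling order."""
    rows = conn.execute(
        "SELECT s.name FROM lines_to_stops ls "
        "JOIN stops s ON s.id = ls.stop_fk "
        "WHERE ls.line_fk = ? ORDER BY ls.order_in_line",
        (line_id,),
    )
    return [name for (name,) in rows]


def lines_through_stop(stop_id):
    """Ids of all lines that serve the given stop."""
    rows = conn.execute(
        "SELECT line_fk FROM lines_to_stops WHERE stop_fk = ? ORDER BY line_fk",
        (stop_id,),
    )
    return [line_id for (line_id,) in rows]
```

Both directions of the relationship are a plain indexed lookup here, which is exactly what a comma-separated list in a single column cannot offer.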

Indicating a "canonical" record in a one-to-many table

Imagine we have a table of countries, and a table of cities. A country can of course have many cities, but a city can only be in one country, thus a one-to-many relationship makes intuitive sense:
countries
| id | name |
| 1 | Lorwick |
| 2 | Belmead |
cities
| id | country | name |
| 1 | 1 | Marblecrest |
| 2 | 1 | Westacre |
| 3 | 2 | Belcoast |
| 4 | 1 | Rosemarsh |
| 5 | 2 | Vertston |
But in addition to our one-to-many relationship, we want to describe the one-to-one relationship of national capitals. If it matters, assume that capitals may change fairly regularly, and for that matter cities appear and vanish at will, and that cities may switch countries. Point is, this data is unstable.
I see a couple of options:
Add an int column capital to countries which cannot be null. Pro: always exactly one city; Con: not associated with the city, nothing enforcing the city is in the country, or that it even exists.
Add a boolean column capital to cities, which if true indicates the city is the capital of the associated country. Pro: directly associated with the city in question, no duplicate columns indicating hierarchy; Con: pretty sure this is poor normalization as there's nothing stopping there being zero, or more than one, "capital" for a given country.
Create an additional table capitals with columns country and city and a unique constraint on both columns (or at least on city). Pro: feels cleaner, easy joining on either countries or cities; Con: still doesn't ensure city is in country, or that either exist.
What is the most normalized and/or best way to represent this relationship? Is there any way to ensure each country has exactly one capital which does in fact exist and resides inside that country? I imagine it's not possible, in which case, how can I best minimize issues for my client code?
I'm currently using SQLite, but I'm interested in generalized answers, regardless of the underlying database.
I did a little digging and found Indicating primary/default record in database but I don't think this really answers my question.
PS: It's not that bad if there's no capital (there may be no cities!), but it would be bad if there were multiple.
I think the requirement "each country has exactly one capital" conflicts with the requirement "cities appear and vanish at will". If a city can vanish, it follows that a capital city can vanish, too.
You can enforce the constraint "each country has [zero or] one capital which does in fact exist and resides inside that country" with a foreign key constraint on a table of capitals.
create table capitals (
country_id integer primary key,
city_id integer not null,
foreign key (country_id, city_id) references cities (country_id, city_id)
);
In that table, the primary key constraint guarantees that there can be no more than one capital per country. The foreign key constraint guarantees that the capital you choose exists in the country you choose. In the referenced table (the "cities" table), you also need a unique constraint on {city_id, country_id}; since {city_id} is unique in the "cities" table, {city_id, country_id} will necessarily be unique in that table, too, so that's not a problem.
The declarative "way" to guarantee a one-to-one relationship between countries and capitals (not a one-to-zero-or-one relationship) is to use an assertion. But I don't know of any current SQL dbms that supports CREATE ASSERTION. That forces us to rely on one or more of these:
triggers and possibly deferred constraints,
application code, or
administrative procedures.
(Initially, you'd have to enter a row in the three tables "countries", "cities", and "capitals" in a single transaction in order to satisfy all the constraints. I think you'll need deferred constraints for that, but I haven't had coffee yet today.)
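The capitals table above can be tried out with a short sketch in Python's sqlite3 module (the question uses SQLite). The column names follow the CREATE TABLE statement in this answer, the sample rows come from the question, and the try_set_capital helper is a hypothetical wrapper added for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE countries (
        country_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE cities (
        city_id    INTEGER PRIMARY KEY,
        country_id INTEGER NOT NULL REFERENCES countries (country_id),
        name       TEXT NOT NULL,
        UNIQUE (country_id, city_id)   -- target of the composite foreign key
    );
    CREATE TABLE capitals (
        country_id INTEGER PRIMARY KEY,   -- at most one capital per country
        city_id    INTEGER NOT NULL,
        FOREIGN KEY (country_id, city_id) REFERENCES cities (country_id, city_id)
    );
    INSERT INTO countries VALUES (1, 'Lorwick'), (2, 'Belmead');
    INSERT INTO cities VALUES (1, 1, 'Marblecrest'), (2, 1, 'Westacre'), (3, 2, 'Belcoast');
    INSERT INTO capitals VALUES (1, 1);   -- Marblecrest is the capital of Lorwick
""")


def try_set_capital(country_id, city_id):
    """Insert a capital row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO capitals (country_id, city_id) VALUES (?, ?)",
                (country_id, city_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling try_set_capital(2, 2) is rejected by the foreign key because Westacre belongs to Lorwick, try_set_capital(1, 2) is rejected by the primary key because Lorwick already has a capital, and try_set_capital(2, 3) goes through.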
For clarity and simplicity, I'd add the boolean IsCapital column to the cities table. Then add a trigger that sets IsCapital = false on all other cities that share the updated record's country whenever IsCapital is set to true on a record. This will handle most of your concerns. Ensuring that there is exactly one capital per country isn't really possible; you can ensure there is 0 or 1, but since the cities table has a FK constraint to countries, there is always going to be a point in time where inserted countries have no cities that can be set as the capital.
FWIW, I think logic should be left to the app, referential integrity to the database.
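A sketch of the trigger approach, again with Python's sqlite3 module. The column is spelled is_capital here and stored as 0/1 because SQLite has no boolean type; the trigger names and the two helper functions are made up, and the cities rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (
        id         INTEGER PRIMARY KEY,
        country    INTEGER NOT NULL,
        name       TEXT NOT NULL,
        is_capital INTEGER NOT NULL DEFAULT 0 CHECK (is_capital IN (0, 1))
    );

    -- When a city is flagged as capital, clear the flag on every other
    -- city of the same country.
    CREATE TRIGGER cities_single_capital_upd
    AFTER UPDATE OF is_capital ON cities
    WHEN NEW.is_capital = 1
    BEGIN
        UPDATE cities SET is_capital = 0
        WHERE country = NEW.country AND id <> NEW.id AND is_capital = 1;
    END;

    -- Same rule for rows that are inserted with the flag already set.
    CREATE TRIGGER cities_single_capital_ins
    AFTER INSERT ON cities
    WHEN NEW.is_capital = 1
    BEGIN
        UPDATE cities SET is_capital = 0
        WHERE country = NEW.country AND id <> NEW.id AND is_capital = 1;
    END;

    INSERT INTO cities (id, country, name) VALUES
        (1, 1, 'Marblecrest'), (2, 1, 'Westacre'), (3, 2, 'Belcoast'),
        (4, 1, 'Rosemarsh'), (5, 2, 'Vertston');
""")


def set_capital(city_id):
    conn.execute("UPDATE cities SET is_capital = 1 WHERE id = ?", (city_id,))


def capitals_of(country):
    rows = conn.execute(
        "SELECT name FROM cities WHERE country = ? AND is_capital = 1 ORDER BY id",
        (country,),
    )
    return [name for (name,) in rows]
```

As the answer says, this only guarantees zero or one capital per country: a country whose capital is deleted, or which has no flagged city yet, simply has none.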
To make sure there is exactly one capital per country and the capital is not a city from a different country, do this:
Note how we use the identifying relationship to migrate the COUNTRY_ID to CITY's PK, so it can be migrated back to the COUNTRY - this is what guarantees a capital must actually belong to the country it is the capital of.
The circular reference here prevents the insertion of new data, which is resolved using deferred foreign keys if the DBMS supports them. Otherwise, you can just leave COUNTRY.CAPITAL_NO NULL-able (and enforce its eventual non-NULL-ness at the application level). [1]
[1] This assumes the DBMS has MATCH SIMPLE foreign keys (i.e. FK is ignored if any of its components are NULL). If the DBMS supports only MATCH PARTIAL or FULL (such as MS Access), you are out of luck, and would have to emulate the FK through non-declarative means (triggers or application code).
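Since the diagram for this answer is not reproduced here, the following is only a guess at the model it describes, sketched with Python's sqlite3 module (SQLite supports deferred foreign keys). COUNTRY_ID and CAPITAL_NO are named in the text; CITY_NO, the NAME columns and the insert_country helper are assumptions.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN/COMMIT statements below control the transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE COUNTRY (
        COUNTRY_ID INTEGER PRIMARY KEY,
        NAME       TEXT    NOT NULL,
        CAPITAL_NO INTEGER NOT NULL,
        -- The capital must be a city of this very country. The check is
        -- deferred to COMMIT so country and capital can be inserted together.
        FOREIGN KEY (COUNTRY_ID, CAPITAL_NO) REFERENCES CITY (COUNTRY_ID, CITY_NO)
            DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE CITY (
        COUNTRY_ID INTEGER NOT NULL REFERENCES COUNTRY (COUNTRY_ID),
        CITY_NO    INTEGER NOT NULL,
        NAME       TEXT    NOT NULL,
        PRIMARY KEY (COUNTRY_ID, CITY_NO)   -- COUNTRY_ID migrated into the PK
    );
""")


def insert_country(country_id, name, capital_no, capital_name=None):
    """Insert a country and, optionally, its capital city in one transaction.

    Returns False (after rolling back) if a constraint is violated.
    """
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO COUNTRY VALUES (?, ?, ?)", (country_id, name, capital_no)
        )
        if capital_name is not None:
            conn.execute(
                "INSERT INTO CITY VALUES (?, ?, ?)",
                (country_id, capital_no, capital_name),
            )
        conn.execute("COMMIT")  # the deferred FK is checked here
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return False
```

Inserting a country together with its capital succeeds, while a country whose CAPITAL_NO points at a city that does not exist is rejected when the transaction commits.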

What's the best way to retrieve this data?

The architecture for this scenario is as follows:
I have a table of items and several tables of forms. Rather than having the forms own the items, the items own the forms. This is because one item can be on several forms (although only one of each type, but not necessarily on any). The forms and items are all tied together by a common OrderId. This can be represented like so:
OrderItems | Form A | Form B etc....
---------- |--------- |
ItemId |FormAId |
OrderId |OrderId |
FormAId |SomeField |
FormBId |OtherVar |
FormCId |etc...
This works just fine for these forms. However, there is another form, (say, FormX) which cannot have an OrderId because it consists of items from multiple orders. OrderItems does contain a column for FormXId as well, but I'm confused about the best way to get a list of the "FormX"s related to a single OrderId. I'm using MySQL and was thinking maybe a stored proc was the best way to go on this, but I've never used a stored proc on MySQL and don't really know the best way to go about it. My other (kludgy) option was to hit the DB twice, first to get all the items that are for the given OrderId that also have a FormXId, and then get all their FormXIds and do a dynamic SELECT statement where I do something like (pseudocode)
SELECT whatever FROM sometable WHERE FormXId=x OR FormXId=y....
Obviously this is less than ideal, but I can't really think of any other way... anything better I could do either programmatically or architecturally? My back-end code is ASP.NET.
Thanks so much!
UPDATE
In response to the request for more info:
Sample input:
OrderId = 1000
Sample output
FormXs:
-----------------
FormXId | FieldA | FieldB | etc
-------------------------------
1003 | value | value | ...
1020 | ... .. ..
1234 | .. . .. . . ...
You see the problem is that FormX doesn't have one single OrderId but is rather a collection of OrderIds. Sometimes multiple items from the same order are on FormX, sometimes it's just one, most orders don't have any items on FormX. But when someone pulls up their order, I need for all the FormXs their items belong on to show up so they can be modified/viewed.
I was thinking of maybe creating a stored proc that does what I said above, run one query to pull down all the related OrderIds and then another to return the appropriate FormXs. But there has to be a better way...
I understand you need to get a list of the "FormX"s related to a single OrderId. You say, that OrderItems does contain a column for FormXId.
You can issue the following query:
select FormX.*
from OrderItems
join FormX on OrderItems.FormXId = FormX.FormXId
where OrderItems.OrderId = @orderId
You need to pass @orderId to your query and you will get a record set with the FormX records related to this order.
You can either package this query up as a stored procedure using an @orderId parameter, or you can use dynamic SQL and substitute @orderId with the real order number you are executing your query for.
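Here is a runnable sketch of that join, using Python's sqlite3 module as a stand-in for MySQL and a bound parameter instead of string substitution. The table contents are invented, and DISTINCT is an addition of mine: because several items of one order can sit on the same FormX, the plain join would otherwise return that form once per item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FormX (FormXId INTEGER PRIMARY KEY, FieldA TEXT);
    CREATE TABLE OrderItems (
        ItemId  INTEGER PRIMARY KEY,
        OrderId INTEGER NOT NULL,
        FormXId INTEGER REFERENCES FormX (FormXId)  -- NULL: item is on no FormX
    );
    INSERT INTO FormX VALUES (1003, 'a'), (1020, 'b'), (1234, 'c');
    INSERT INTO OrderItems VALUES
        (1, 1000, 1003), (2, 1000, 1003), (3, 1000, 1020), (4, 1000, NULL),
        (5, 2000, 1234), (6, 2000, 1003);
""")


def formx_for_order(order_id):
    """All FormX rows that contain at least one item of the given order."""
    return conn.execute(
        """
        SELECT DISTINCT FormX.FormXId, FormX.FieldA
        FROM OrderItems
        JOIN FormX ON OrderItems.FormXId = FormX.FormXId
        WHERE OrderItems.OrderId = :orderId
        ORDER BY FormX.FormXId
        """,
        {"orderId": order_id},
    ).fetchall()
```

One round trip is enough: the join replaces the two-step "fetch the FormXIds, then build an OR list" approach from the question.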

ASP.NET and a One-to-Many-to-Many Scenario

I'm new to ASP.NET but not to programming. I am migrating our current site from PHP/MySQL to ASP.NET(3.5)/SqlServer. I've been lurking here since the site's launch, so I'm confident that one (or more) of you can help me out. Here's the scenario:
This is a training department site and the dept. has a course catalog stored in the table course. Each course may have many prerequisite courses. For example, A and B are prerequisites for C. I would normally store this either as a comma-delimited column in course or in a separate table course_prereq or course_course as a recursive relationship. This part I can do.
However, the business rules require that each course can have multiple sets of prerequisites. For example, N requires A, B and C, or N requires X and Y. This is where I'm stuck.
Previously, I stored this information in a column for row N as A,B,C|X,Y, parsed the ids into a PHP 2D-array, submitted a second query for all the rows whose id was in that array, then used PHP to separate those rows into their respective groups. Once all this processing is done, the groups of prerequisites are displayed as separate tables on the web page, like so:
| A | A's information goes here |
| B | B's information goes here |
| C | C's information goes here |
- - - - - - - OR - - - - - - - -
| X | X's information goes here |
| Y | Y's information goes here |
How would I accomplish this using ASP.NET?
Add a table to hold Prerequisite Sets. This table holds a set ID and key back to the courses table for each course in the set. The table may have several rows for a given set ID, so your primary key will be the set ID plus the course ID. Then in your course_prereq table you relate courses to the different prerequisite sets. An OR relationship can be assumed there because any ANDs are enforced in the sets themselves.
Have a table called PrerequisiteSet that FKs to each prereq. Then have a Course_PrerequisiteSet many to many table that FKs to Course and PrerequisiteSet. Most of the time there will only be one entry in Course_PrerequisiteSet, but if there are more than one, then it will be an OR relationship.
Both the answers above were very helpful. I ended up using just one database table instead of the suggested two. The table contains a course_id, prereq_id, and set_id, which all together form the primary key.
In the ASP.NET page, I use a repeater to loop over the sqldatasource stored procedure that returns a course's prerequisite sets, and a gridview inside that repeater that reads the individual prerequisite information from a second sqldatasource stored procedure. Like this:
RepeaterSqlDataSource (returns set ids)
Repeater
. . . GridViewSqlDataSource (returns course info for each prereq_id in set)
. . . GridView
Hope this is helpful to anyone else looking at a similar scenario.
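The single-table design can be sketched outside ASP.NET as well. The snippet below uses Python with sqlite3 purely to illustrate the data shape: one query returns every prerequisite of a course ordered by set, and the rows are then grouped per set_id, which mirrors what the repeater and the nested grid view do. The course_prereq table name, the sample courses and the prerequisite_sets helper are assumptions.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (id TEXT PRIMARY KEY, info TEXT NOT NULL);
    CREATE TABLE course_prereq (
        course_id TEXT    NOT NULL REFERENCES course (id),
        prereq_id TEXT    NOT NULL REFERENCES course (id),
        set_id    INTEGER NOT NULL,
        PRIMARY KEY (course_id, prereq_id, set_id)
    );
    INSERT INTO course VALUES
        ('A', 'Course A'), ('B', 'Course B'), ('C', 'Course C'),
        ('X', 'Course X'), ('Y', 'Course Y'), ('N', 'Course N');
    -- N requires (A and B and C) or (X and Y).
    INSERT INTO course_prereq VALUES
        ('N', 'A', 1), ('N', 'B', 1), ('N', 'C', 1),
        ('N', 'X', 2), ('N', 'Y', 2);
""")


def prerequisite_sets(course_id):
    """Return a list of sets; each set is a list of (id, info) tuples."""
    rows = conn.execute(
        "SELECT p.set_id, c.id, c.info "
        "FROM course_prereq p JOIN course c ON c.id = p.prereq_id "
        "WHERE p.course_id = ? ORDER BY p.set_id, c.id",
        (course_id,),
    ).fetchall()
    # groupby relies on the rows already being sorted by set_id.
    return [
        [(cid, info) for _, cid, info in group]
        for _, group in groupby(rows, key=lambda row: row[0])
    ]
```

Each inner list corresponds to one table on the page, and the boundaries between inner lists are where the "OR" separator goes.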
