Comment system database design - asp.net

I'm developing a system like SO (on a completely different topic), and its replies and comments work like the ones we see every day on Stack Overflow.
My question is this: I load the question with one stored procedure and the replies with another, and now I'm adding a comment system. Do I need to fetch the comments one by one for each reply on the topic?
That would mean that with a page size of 20 replies I'd be doing 22 database operations, which is more than I had in mind.
I don't think I need to add my database diagram for this question but still here it is:
Questions
-----------
QUESTION_ID
USER_ID
QUESTION_TEXT
DATE
REPLIES
-----------
REPLY_ID
QUESTION_ID
USER_ID
REPLY_TEXT
DATE
COMMENTS
------------
REPLY_ID (fk replies)
USER_ID
TEXT
DATE

You should get all your comments at once.
Then make DataViews from the result with a filter for each reply and bind to that DataView. You could also use LINQ to Entities and just filter out new sets on each bind. Here is a basic pseudocode example:
Get all comments for all replies to question
Bind replies
Implement the OnDataBinding for the reply control that will display the comments
In the OnDataBinding add a filter to the result set for the comments with the same reply ID
Bind the filtered list of comments to the display control for comments
This should work; I have implemented the same scenario for similar types of data structures.
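As a rough illustration of the "fetch once, filter per reply" idea above, here is a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for the real server; the column names follow the question's diagram, while the sample rows and the helper name `comments_by_reply` are made up for the example.

```python
import sqlite3
from collections import defaultdict

def comments_by_reply(conn, question_id):
    """Fetch every comment for every reply to one question in a single query,
    then group the rows in memory by REPLY_ID."""
    rows = conn.execute(
        """
        SELECT c.REPLY_ID, c.USER_ID, c.TEXT
        FROM COMMENTS AS c
        JOIN REPLIES AS r ON r.REPLY_ID = c.REPLY_ID
        WHERE r.QUESTION_ID = ?
        ORDER BY c.DATE
        """,
        (question_id,),
    ).fetchall()
    grouped = defaultdict(list)
    for reply_id, user_id, text in rows:
        grouped[reply_id].append((user_id, text))
    return grouped

# Hypothetical sample data: question 10 has two replies, one with comments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE REPLIES (REPLY_ID INTEGER PRIMARY KEY, QUESTION_ID INTEGER,
                      USER_ID INTEGER, REPLY_TEXT TEXT, DATE TEXT);
CREATE TABLE COMMENTS (REPLY_ID INTEGER REFERENCES REPLIES(REPLY_ID),
                       USER_ID INTEGER, TEXT TEXT, DATE TEXT);
INSERT INTO REPLIES VALUES (1, 10, 100, 'first reply', '2011-01-01'),
                           (2, 10, 101, 'second reply', '2011-01-02'),
                           (3, 99, 102, 'other question', '2011-01-03');
INSERT INTO COMMENTS VALUES (1, 200, 'comment a', '2011-01-04'),
                            (1, 201, 'comment b', '2011-01-05'),
                            (3, 202, 'unrelated', '2011-01-06');
""")

grouped = comments_by_reply(conn, 10)
```

Each reply's data-binding step can then look up `grouped[reply_id]` instead of issuing its own query, so the page costs three round trips (question, replies, comments) regardless of the page size.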

Pabuc,
For your initial question, why not get all the replies for the given question with a single query?
select reply_text, user_id
from REPLIES
where QUESTION_ID = :question_id -- input
order by DATE asc
Also, as you pointed out, apart from minor differences, the question and the answer have almost the same attributes; both are essentially posts.
Wouldn't a model like the one below make more sense? The Question and Answer are both "posts" with the only difference being an answer has the question as the parent and the question has no parent.
create table post ( -- question/reply
post_id number primary key,
parent_post_id number, -- null for a question; holds the question's post_id
-- if the row is a reply to that question
post_text varchar2(4000),
user_id number,
post_date date);
-- self-referential foreign key
alter table post
add constraint fk_post_parent foreign key (parent_post_id) references post(post_id);
-- comments on all posts (questions/replies)
create table comments(
comment_id number primary key,
post_id number,
comment_txt varchar2(140),
comment_user_id number,
comment_date date
);
alter table comments add constraint fk_comments_post
foreign key (post_id) references post(post_id);
-- for a given question (post) id, you can get all the replies and their comments using...
-- (the outer join keeps replies that have no comments yet)
select replies.*,
comments.*
from post replies
left outer join comments
on comments.post_id = replies.post_id
where replies.parent_post_id = :Question_id --input
You might have to add an order by clause to get the results based on points, updated_timestamp or any other attribute as needed.

For Each ,For First

What is the meaning of FOR EACH and FOR FIRST? Example below:
FOR EACH <db> NO-LOCK,
FIRST <db> OF <db> NO-LOCK:
DISPLAY ..
Also, why do we need to use NO-LOCK for every table, every time?
Let's answer by giving an example based on the Progress demo DB:
FOR EACH Customer WHERE Customer.Country = "USA" NO-LOCK,
FIRST Salesrep WHERE Salesrep.salesrep = Customer.Salesrep:
/* your code block */
END.
The FOR EACH Block is an iterating block (loop) that integrates data access (and a few more features like error handling and frame scoping if you want to go that far back).
So the code in "your code block" is executed for every Customer record matching the criteria and it also fetches the matching Salesrep records. The join between Customer and Salesrep is an inner join. So you'll only be processing Customers where the Salesrep exists as well.
FOR statement documentation (includes EACH and FIRST keywords)
NO-LOCK documentation
Google is your friend and documentation on packages is usually quite user-friendly.
Try not to ask questions that can be solved by simple search on StackOverflow.
FOR EACH table
Selects a set of records and starts a block to process those records.
NO-LOCK means what it says, the records are retrieved from the database without any record locking. So you might get a "dirty read" (uncommitted data) and someone else might change the data while you are looking at that record.
That sounds awful but, in reality, NO-LOCK reads are almost always what you want to use. If you do need to update a NO-LOCK record you can just FIND CURRENT with a lock.
FOR EACH NO-LOCK can return large numbers of records in a single network message whereas the other lock types are one record at a time - this makes NO-LOCK quite a bit faster for many purposes. And even without the performance argument you probably don't want to be taking out large numbers of locks and preventing other users running inquiries all the time.
Your example lacks a WHERE clause so, by default, every record in the table is returned using the primary index. If you specify a WHERE clause you will potentially only have a subset of the data to loop through and the index selection may be impacted. You can also add a lot of other options like BY to specify sort order.
FOR FIRST is somewhat similar to FOR EACH except that it returns, at most, a single record, even if the WHERE clause is empty or would otherwise specify a larger result set. BUT BE CAREFUL - the "FIRST" is deceptive. Even if you specify a sort order using BY, the rule is "selection, then sorting". At most one record gets selected, so the BY doesn't matter. The index dictated by the WHERE (or lack of a WHERE) determines the sort order. So if you request something like:
FOR FIRST customer NO-LOCK BY discount:
DISPLAY custNum name discount.
END.
You will fetch customer #1, not customer #41 as you might have expected. (Try the code above with the sports2000 database. Replace FIRST with EACH in a second run.)
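The "selection, then sorting" rule can be modelled outside ABL. The following Python sketch is only an analogy, not how the Progress runtime is implemented: the rows are hypothetical and are assumed to arrive in primary-index (custNum) order, and the helper names `for_first_by` and `for_each_by` are invented for the illustration.

```python
# Hypothetical rows, already in primary-index (custNum) order.
customers = [
    {"custNum": 1, "name": "A", "discount": 35},
    {"custNum": 2, "name": "B", "discount": 10},
    {"custNum": 3, "name": "C", "discount": 0},
]

def for_first_by(rows, key):
    """Model of FOR FIRST ... BY: select one row in index order, then sort."""
    selected = rows[:1]  # selection happens first
    # Sorting a single row changes nothing, so BY has no effect here.
    return sorted(selected, key=lambda r: r[key])

def for_each_by(rows, key):
    """Model of FOR EACH ... BY: select all rows, then sort them."""
    return sorted(rows, key=lambda r: r[key])

first = for_first_by(customers, "discount")
each = for_each_by(customers, "discount")
```

In this model `first` holds customer 1 (the first row by index), while the first element of `each` is customer 3, the row with the lowest discount.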
FOR EACH table1 NO-LOCK,
FIRST table2 NO-LOCK OF table1:
or
FOR EACH customer NO-LOCK,
FIRST salesRep NO-LOCK OF customer:
DISPLAY custnum name customer.salesRep.
END.
Is a join. The OF is a shortcut telling the compiler to find fields that the two tables have in common to build an implied WHERE clause from. This is one of those "makes a nice demo" features that you don't want to use in real code. It obfuscates the relationship between the tables and makes your code much harder to follow. Don't do that. Instead write out the complete WHERE clause. Perhaps like this:
for each customer no-lock,
first salesRep no-lock where salesRep.salesRep = customer.salesRep:
display custnum name customer.salesRep.
end.

Last Function in Query

I currently have a database that keeps track of projects, project updates, and the update dates. I have a form with a subform that displays the project name and the most recent update made to that project. It was brought to my attention, however, that the most recent update to a project does not display correctly. For example, it shows the update date of 4/6/2017, but the actual update text is from 3/16/2017.
Doing some spot research, I then learned that Access does not store records in any particular order, and that the Last function does not actually give you the last record.
I am currently scouring Google for a solution, so far to no avail, and have turned here in hopes of a solution or idea. Thank you in advance for any insight you can provide!
Other details:
tblProjects has fields
ID
Owner
Category_ID
Project_Name
Description
Resolution_Date
Priority
Resolution_Category_ID
tblUpdates has these fields:
ID
Project_ID
Update_Date
Update
There is no built-in Last function that I am aware of in Access or VBA; where exactly are you seeing that used?
If your subform is bound directly to tblUpdates, then you ought to be able to just sort the subform in descending order on either ID or Update_Date.
If you have a query joining the two tables and are only trying to get a single row returned from tblUpdates, then the following would do that, assuming the ID column in tblUpdates is an autonumber. If not, just replace ORDER BY b.ID DESC with ORDER BY b.Update_Date DESC. (Update is a reserved word, so the column name is bracketed.)
SELECT a.*,
(SELECT TOP 1 b.[Update] FROM tblUpdates AS b WHERE b.Project_ID = a.ID ORDER BY b.ID DESC) AS last_update
FROM tblProjects AS a;
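To see the correlated-subquery approach in action, here is a small Python sketch that uses an in-memory SQLite database as a stand-in for Access. It is an assumption-laden illustration: SQLite has no TOP, so LIMIT 1 plays that role, the dates are stored as ISO strings so they sort correctly, and the sample rows are invented. The rows are inserted out of date order on purpose, so ordering by Update_Date matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblProjects (ID INTEGER PRIMARY KEY, Project_Name TEXT);
CREATE TABLE tblUpdates (ID INTEGER PRIMARY KEY, Project_ID INTEGER,
                         Update_Date TEXT, [Update] TEXT);
INSERT INTO tblProjects VALUES (1, 'Alpha'), (2, 'Beta');
-- Rows are deliberately inserted out of date order.
INSERT INTO tblUpdates VALUES (1, 1, '2017-04-06', 'newest alpha note'),
                              (2, 1, '2017-03-16', 'older alpha note'),
                              (3, 2, '2017-02-01', 'only beta note');
""")

# For each project, pick the update with the latest date (ID breaks ties).
rows = conn.execute("""
SELECT a.Project_Name,
       (SELECT b.[Update] FROM tblUpdates AS b
         WHERE b.Project_ID = a.ID
         ORDER BY b.Update_Date DESC, b.ID DESC
         LIMIT 1) AS last_update
FROM tblProjects AS a
ORDER BY a.ID
""").fetchall()
```

With this data, ordering by ID alone would pick the older note for project Alpha, which is the kind of mismatch described in the question.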

T-Sql ordering results with count() by priority

I am trying to build a system that has threads and posts. I want to fetch the most popular thread: the one with the most posts and, among those, the most likes (users can click a "like" button to make a thread more popular). The problem is ordering the results by number of posts first, and then by number of likes.
So for example, if one thread has 300 posts and 200 likes, while another thread has 300 posts and 201 likes, I want the second thread to be selected.
Table structure in a nutshell:
topic:
--------
topic_id
liked
comment:
-------
comment_id
topic_id
Here is my stored procedure so far:
dbo.Trends
AS
SELECT TOP 1 title, COUNT(com.topic_id), COUNT(topc.user_id_liked)
FROM comment AS com
INNER JOIN topic AS topc ON com.topic_id=topc.topic_id
GROUP BY com.topic_id, topc.user_id_liked,title
ORDER BY COUNT(com.topic_id), COUNT(topc.user_id_liked) DESC
I am not sure if I am right, or whether I will have to resort to control-flow logic. I placed the topic_id count before the liked column in the ORDER BY clause, hoping that the ordering on topic_id would take precedence.
UPDATED: query updated.
I don't really know what you want. But maybe this will help:
;WITH CTE
AS
(
SELECT
COUNT(com.topic_id) OVER(PARTITION BY topc.liked) AS topicCount,
COUNT(com.liked) OVER(PARTITION BY topc.topic_id) AS likedCount,
title
FROM
comment AS com
INNER JOIN topic AS topc
ON com.topic_id=topc.topic_id
)
SELECT TOP 1
CTE.title,
CTE.topicCount,
CTE.likedCount
FROM
CTE
ORDER BY
topicCount DESC,
likedCount DESC
EDIT
The difference between GROUP BY and PARTITION BY is that PARTITION BY works like an inline GROUP BY, so it does not affect the number of rows. I like to use it in a CTE, which is an inline view. It makes things clearer and lets you separate the different steps you want to do. If you remove the TOP 1 you will see what I mean.
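The row-count difference is easy to check with a toy table. This Python sketch uses an in-memory SQLite database rather than SQL Server (window functions need SQLite 3.25 or later), and the four sample comments are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, topic_id INTEGER);
INSERT INTO comment (topic_id) VALUES (1), (1), (1), (2);
""")

# GROUP BY collapses each topic to a single row.
grouped = conn.execute(
    "SELECT topic_id, COUNT(*) FROM comment GROUP BY topic_id ORDER BY topic_id"
).fetchall()

# PARTITION BY keeps every comment row and repeats the per-topic count on each.
windowed = conn.execute(
    "SELECT topic_id, COUNT(*) OVER (PARTITION BY topic_id) "
    "FROM comment ORDER BY comment_id"
).fetchall()
```

Here `grouped` has one row per topic, while `windowed` still has four rows, one per comment, each carrying its topic's count.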

Best Practices for updating multiple check boxes on a web form to a database

A sample case scenario - I have a form with one question and multiple answers as checkboxes, so you can choose more than one. Table for storing answers is as below:
QuestionAnswers
(
UserID int,
QuestionID int,
AnswerID int
)
What is the best way of updating those answers in the database using a stored proc? At different jobs I've seen the whole spectrum, from simply deleting all previous answers and inserting new ones, to passing a list of answers to remove and a list of answers to add to the stored proc.
In my current project performance and scalability are pretty important, so I'm wondering what's the best way of doing it?
Thanks!
Andrey
If I had a choice of table design, and the following statements are true:
You know the maximum number of choices per question.
Each choice is simply checked or unchecked.
Each answer can be classified as correct or wrong rather than graded on a scale (like 70% right).
Then, for performance, I would consider the following table instead of the one you presented:
QuestionAnswers
(
UserID int,
QuestionID int,
Choice1 bool,
Choice2 bool,
...
ChoiceMax bool
)
Yes, it is ugly in terms of normalization, but that denormalization buys performance and simplifies queries: just one update/insert per question. (I would update first and insert only if the affected row count is zero.)
Detecting whether the answer was correct also becomes simpler with the following table:
QuestionCorrectAnswers
(
QuestionID int,
Choice1 bool,
Choice2 bool,
...
ChoiceMax bool
)
All you need to do is look up the row in QuestionCorrectAnswers with the same combination of choices the user gave.
If the questions are always the same, then you'd never delete anything - just run an update query on all changed Answers.
UPDATE QuestionAnswers
SET AnswerID = @AnswerID
WHERE UserID = @UserID AND QuestionID = @QuestionID
If for some reason you still need to do some delete/insert - I'd check which QuestionIDs already exist (for the given UserID) so you do a minimum of Delete/Insert.
Updates are just far faster than Delete then Insert, plus you don't make any identity columns skyrocket.
I presume you load the QuestionAnswers from DB upon entering the page, so the user can see which answers he/she gave last time - if you did you already have the necessary data in memory, for determining what to delete, insert and update.
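If the previously stored answers are already in memory, working out the minimal set of changes is a pair of set differences. A minimal Python sketch, where the function name `diff_answers` and the sample AnswerIDs are hypothetical:

```python
def diff_answers(stored, submitted):
    """Return (to_delete, to_insert) so that only changed rows are touched."""
    stored, submitted = set(stored), set(submitted)
    to_delete = stored - submitted   # previously checked, now unchecked
    to_insert = submitted - stored   # newly checked
    return to_delete, to_insert

# The user had answers 1, 2, 3 checked and now submits 2, 3, 4.
to_delete, to_insert = diff_answers(stored={1, 2, 3}, submitted={2, 3, 4})
```

Answers 2 and 3 are untouched; only AnswerID 1 is deleted and AnswerID 4 inserted. The two sets can then be passed to the stored proc as the "remove" and "add" lists mentioned in the question.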
Andrey: If the user (userid=1) selects choices a(answerid=1) & b(answerid=2) for question 1(questionid=1) and later switches to c (a-id=3) & d(a-id=4), you would have to check, then delete the old and add the new. If you choose to follow the other approach, you would not be checking if a particular record exists (so that you can update it), you would just delete old records and insert new records. Anyways, since you are not storing any identity columns, I would go with the latter approach.
Here is a simple solution:
Every [Answer] should have an integer bit value that is unique within its question.
For example, you have Question1 and four predefined answers:
[Answer] [Bit value]
answer1 0x00001
answer2 0x00002
answer3 0x00004
answer4 0x00008
...
So, your SQL INSERT/UPDATE will be:
declare @checkedMask int
set @checkedMask = 0x00009 -- answer 1 and answer 4 are checked
declare @questionId int
set @questionId = 1
-- delete results whose answers are no longer checked
delete
--select r.*
r
from QuestionResult r
inner join QuestionAnswer a
on r.QuestionId = a.QuestionId and r.AnswerId = a.AnswerId
where r.QuestionId = @questionId
and (a.mask & @checkedMask) = 0
-- insert results for newly checked answers
insert QuestionResult (AnswerId, QuestionId)
select
AnswerId,
QuestionId
from QuestionAnswer a
where a.QuestionId = @questionId
and (a.mask & @checkedMask) > 0
and not exists(select AnswerId from QuestionResult r
where r.QuestionId = @questionId and r.AnswerId = a.AnswerId)
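On the application side, the mask passed to the SQL above can be built and read back with bitwise operations. A short Python sketch, assuming the four bit values from the table above; the names `ANSWER_BITS`, `encode` and `decode` are made up for the example.

```python
# Bit values from the table above (answer name -> mask).
ANSWER_BITS = {"answer1": 0x1, "answer2": 0x2, "answer3": 0x4, "answer4": 0x8}

def encode(checked):
    """Combine the checked answers into a single integer mask."""
    mask = 0
    for name in checked:
        mask |= ANSWER_BITS[name]
    return mask

def decode(mask):
    """List the answers whose bit is set in the mask."""
    return [name for name, bit in ANSWER_BITS.items() if mask & bit]

mask = encode(["answer1", "answer4"])
```

Checking answer 1 and answer 4 yields 0x9, the same value used for the checked mask in the SQL.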
Sorry to resurrect an old thread. I would have thought the only realistic solution is to delete all responses for that question, and create new rows where the checkbox is ticked. Having a column per answer may be efficient as far as updates go, but the inflexibility of this approach is just not an option. You need to be able to add options to a question without having to redesign your database.
Just delete and re-insert. That's what databases are designed to do: store and retrieve lots of rows of data.
I disagree that regent's answer is denormalized. As long as each answer is not dependent on another column, and is only dependent on the key, it is in 3rd normal form. It is no different than a table with the following fields for a customer name:
CustomerName
(
name_prefix
name_first
name_mi
name_last
name_suffix
city
state
zip
)
Same as
QuestionAnswers
(
Q1answer1
Q1answer2
Q1answerN
)
There really is no difference between the "Question" of name and the multiple answers which may or may not be filled out and the "Question" of the form and the multiple answers that may or may not be selected.

Unique record in Asp.Net SQL

I asked this question previously but the answers weren't what I was looking for.
I created a table in Asp.net without using code. It contains two columns.
YourUserId and FriendUserId
This is a many to many relationship.
Here's what I want:
There can be multiple records with your name as the UserId, there can also be multiple records with FriendUserId being the same...but there cannot be multiple records with both being the same. For example:
Dave : Greg
Dave : Chris
Greg : Chris
Chris : Greg
is good
Dave : Greg
Dave : Greg
is not good.
I right-clicked on the table and chose Indexes/Keys. I then put both columns in the columns section and chose to make them unique. I thought this would make them unique as a pair while leaving each column individually non-unique.
If you go to the Dataset, it shows keys next to both columns and says that there is a constraint with both columns checked.
Is there a way of just making sure that you are not inserting a duplicate copy of a record into the table without individual columns being unique?
I tried controlling it with my SQL insert statement, but that did not work. This is what I tried:
INSERT INTO [FriendRequests] ([UserId], [FriendUserId]) VALUES ('"+UserId+"', '"+PossibleFriend+"') WHERE NOT EXIST (SELECT [UserId], [FriendUserId] FROM [FriendRequests])
That didn't work for some reason.
Thank you for your help!
You should create a compound primary key to prevent duplicate rows.
ALTER TABLE FriendRequests
ADD CONSTRAINT pk_FriendRequests PRIMARY KEY (UserID, FriendUserID)
Or select both columns in table designer and right click to set it as a key.
To prevent self-friendship, you'd create a CHECK constraint:
ALTER TABLE FriendRequests
ADD CONSTRAINT ck_FriendRequests_NoSelfFriends CHECK (UserID <> FriendUserID)
You can add the check constraint in the designer by right clicking anywhere in the table designer, clicking "Check constraints", clicking "add", and setting expression to UserID <> FriendUserID
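The effect of the two constraints can be tried out quickly. The sketch below uses Python with an in-memory SQLite database instead of SQL Server, so the DDL is an approximation of the ALTER TABLE statements above; the helper name `try_insert` is made up, and the names come from the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE FriendRequests (
    UserId TEXT NOT NULL,
    FriendUserId TEXT NOT NULL,
    PRIMARY KEY (UserId, FriendUserId),
    CHECK (UserId <> FriendUserId)
)
""")

def try_insert(user, friend):
    """Insert one pair; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO FriendRequests (UserId, FriendUserId) VALUES (?, ?)",
                (user, friend),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("Dave", "Greg"),
    try_insert("Dave", "Chris"),
    try_insert("Greg", "Chris"),
    try_insert("Chris", "Greg"),
    try_insert("Dave", "Greg"),   # exact duplicate pair
    try_insert("Dave", "Dave"),   # self-friendship
]
```

The first four inserts succeed and the last two are rejected. The sketch also passes the values as parameters rather than concatenating them into the SQL string, which avoids the quoting and injection problems of the INSERT shown in the question.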
You might want to look at this question
Sounds like you need a composite key to make both fields a single key.
I have a better idea. Create a new table. Called FriendRequestRelationships. Have the following columns
FriendRelationshipId (PRIMARY KEY)
UserId_1 (FOREIGN KEY CONSTRAINT)
UserId_2 (FOREIGN KEY CONSTRAINT)
Put a unique constraint on the pair (UserId_1, UserId_2) so that only one relationship between the two users is allowed. This table now serves as your many-to-many relationship harness.
Create a scalar function that can return the FriendUserId for a UserId; let's say it's called fn_GetFriendUserIdForUserId.
You can now display your relationships by running the following query
SELECT dbo.fn_GetFriendUserIdForUserId(UserId_1) AS 'Friend1',
dbo.fn_GetFriendUserIdForUserId(UserId_2) AS 'Friend2'
FROM FriendRequestRelationships
