Can we save multiple records for one customer? [duplicate] - sqlite

I'm quite new to SQLite and SQL and I am struggling with how to approach the following:
My app will display a list of community members. If I click a member, I can see a list of posts made by the members. A post is an object with name, time and message. How can I store this in an SQLite database so that I can query the database by userid and get the list of posts for a specific user.
I have a Users table with these columns:
USER_ID | NAME
I have a Tweet table with these columns:
USER_ID | NAME | TIME | MESSAGE
My questions are: what is the best approach or structure for linking these two tables? Do I create a new tweet table for every user, or do I store all tweets in one long table, with the tweets for user 1 first, then those for user 2, and so on?
I'm not necessarily looking for code dumps but rather an explanation of the logic.

An answer has been given and accepted already, but I wanted to add this.
What you want is one table for users, Users. In this table you store your user information (user_id, user_name).
In your Tweets table, store all tweets for all users. One tweet per row. I'm using tweet_id as PRIMARY KEY for the Tweets table.
You can then 'link' the two in code by doing a JOIN like Dave Swersky said.
For example:
Users
---------------------
user_id | user_name
---------------------
123     | 'Terry'
34      | 'Pierre'

Tweets
-------------------------------------------------------------
tweet_id | user_id | time   | message
-------------------------------------------------------------
0        | 123     | 135646 | 'This is a tweet'
1        | 123     | 132646 | 'This is another tweet by Terry'
2        | 34      | 352646 | 'Pierre\'s tweet'
I'm not sure what name is for in your Tweets table. As far as I know tweets do not have a name/subject(?). You do not need to store the user name in both the tweets and users table.
For a quick SQLFiddle, go here: http://www.sqlfiddle.com/#!2/43492/1/0
Join
SELECT u.user_id, u.name, t.time, t.message
FROM my_users u
INNER JOIN tweets t ON u.user_id = t.user_id
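
As a rough end-to-end sketch of the layout described above, the following uses Python's standard sqlite3 module with an in-memory database. The table and column names follow the example tables in this answer (users, tweets, user_name); the helper name tweets_for_user is an assumption made for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,
        user_name TEXT NOT NULL
    );
    -- One row per tweet; user_id links each tweet to its author.
    CREATE TABLE tweets (
        tweet_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        time     INTEGER NOT NULL,
        message  TEXT NOT NULL
    );
""")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(123, "Terry"), (34, "Pierre")])
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?)", [
    (0, 123, 135646, "This is a tweet"),
    (1, 123, 132646, "This is another tweet by Terry"),
    (2, 34, 352646, "Pierre's tweet"),
])


def tweets_for_user(conn, user_id):
    """Return (user_name, time, message) rows for one user, oldest first."""
    return conn.execute(
        """
        SELECT u.user_name, t.time, t.message
        FROM users u
        INNER JOIN tweets t ON u.user_id = t.user_id
        WHERE u.user_id = ?
        ORDER BY t.time
        """,
        (user_id,),
    ).fetchall()


terry_tweets = tweets_for_user(conn, 123)
```

The user name is stored once in users and picked up through the join, which is why the tweets table does not need its own name column.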

This is a typical "JOIN" scenario where you have a one-to-many relationship between Users and Posts.
Here is an example of a query that would display all users and their posts:
SELECT u.User_ID, u.Name, p.Time, p.Message
FROM Users u INNER JOIN Posts p ON u.User_ID = p.User_ID
This will produce a resultset with four columns. Each "Tweet" will be displayed with its related User record. The 'u.' and 'p.' syntax are table aliases used to make the query easier to read.

You need to have two tables:
1.Users
USER_ID | NAME
2.TWEETS
USER_ID | TIME | MESSAGE
Now for the explanation:
Table 1 represents the users; it holds all the data about each user, such as name, phone and address.
Table 2 holds all the tweets of all the users, with a column that connects each tweet to its user.
In table 2, USER_ID is a foreign key that points to exactly one row in the Users table.
To get all the tweets for one user, you can write the following query:
Select TWEETS.MESSAGE, TWEETS.TIME
from Users, TWEETS
where Users.USER_ID = TWEETS.USER_ID
and Users.NAME = 'Pierre';
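
To make the foreign-key idea concrete, here is a small sketch using Python's standard sqlite3 module. It assumes the two-table layout from this answer. One detail worth knowing is that SQLite only enforces foreign keys when PRAGMA foreign_keys is switched on for the connection. The helper name try_insert_orphan and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- SQLite ignores REFERENCES clauses unless this is enabled per connection.
    PRAGMA foreign_keys = ON;
    CREATE TABLE Users (
        USER_ID INTEGER PRIMARY KEY,
        NAME    TEXT NOT NULL
    );
    CREATE TABLE TWEETS (
        USER_ID INTEGER NOT NULL REFERENCES Users(USER_ID),
        TIME    INTEGER NOT NULL,
        MESSAGE TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO Users VALUES (34, 'Pierre')")
conn.execute("INSERT INTO TWEETS VALUES (34, 352646, 'Pierre''s tweet')")


def try_insert_orphan(conn):
    """Try to store a tweet for a user that does not exist.

    Returns True if the row was accepted, False if SQLite rejected it.
    """
    try:
        conn.execute("INSERT INTO TWEETS VALUES (999, 1, 'orphan tweet')")
    except sqlite3.IntegrityError:
        return False
    return True


orphan_accepted = try_insert_orphan(conn)
tweet_count = conn.execute("SELECT COUNT(*) FROM TWEETS").fetchone()[0]
```

With enforcement on, every tweet row is guaranteed to point at an existing user, so the join in the query above never silently drops tweets.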

Related

DynamoDB Single Table Schema Design with Adjacency Lists

I am trying to understand how to properly design a DynamoDB schema. I've read a few articles, watched some YouTube videos but, to be honest, I don't yet feel quite comfortable.
This is what I am trying to design properly:
two entities, "location" (id & name) and "vehicle" (id & name)
a location can have 0-n vehicles
a vehicle can be in 0-1 locations
Access patterns:
get a list of all available locations (id & name)
get a list of all available vehicles and their current location (id, name, location-id, location-name)
get a list of all vehicles in a given location (id, name)
I've read about adjacency lists and because there will be n-m relations I've decided to give it a try.
This is what I've come up with:
# | PK (GSI1-SK) | SK (GSI1-PK) | DATA
==|======================|====================|==============
1 | LOCATION#locationId1 | A | locationName1
2 | LOCATION#locationId2 | A | locationName2
3 | LOCATION#locationId1 | VEHICLE#vehicleId1 |
4 | LOCATION#locationId1 | VEHICLE#vehicleId2 |
5 | LOCATION#locationId2 | VEHICLE#vehicleId3 |
6 | VEHICLE#vehicleId1 | A | vehicleName1
7 | VEHICLE#vehicleId2 | A | vehicleName2
8 | VEHICLE#vehicleId3 | A | vehicleName3
#1-2 & #6-8 are my entity records, those with additional data for the entity itself (e.g. its name).
#3-5 is an example of how I would design a relationship. I've added an inverted GSI in order to be able to search in both ways.
Back to my access patterns:
get a list of all available locations (id & name)
query GSI1 for SK=A and PK begins with LOCATION#
get a list of all available vehicles and their current location (id, name, location-id, location-name)
query GSI1 for SK=A and PK begins with VEHICLE#
for each result item, query GSI1 for SK=VEHICLE#vehicleId and PK begins with LOCATION#
for each result item, query table for PK=LOCATION#locationId and SK=A
... this doesn't seem right
get a list of all vehicles in a given location (id, name)
query table for PK=LOCATION#locationId and SK begins with VEHICLE#
for each result item, query table for PK=VEHICLE#vehicleId and SK=A
... this doesn't seem right
Adjacency lists look like a nice and clean way to design complex relationships, but either I am doing something wrong (probably) or they require a lot of queries to look things up.
Any advice is appreciated.
I modelled this in DynamoDB Workbench:
Main Index (PK -> SK)
GSI1 (PK1 -> SK)
In order to:
"get a list of all available locations (id & name)"
select * from GSI1 where PK1="ALL#LOCATION"
get a list of all available vehicles and their current location (id, name, location-id, location-name)
select * from MAIN-INDEX where PK="ALL#VEHICLE"
get a list of all vehicles in a given location (id, name)
select * from GSI1 where PK1="LOC#ID"
Several things to note here:
It's important to distribute the traffic across all partition keys. I'm using "ALL#" partition keys in this design. Ideally you shard that somehow, there are several tricks like using dates or timestamp to the beginning of the day. You can randomly spread them across a fixed number of "ALL#" records and then randomly query 1 if your use case allows it. If you have millions of locations this is probably ok. That's how you take these decisions: think of the traffic and the behaviour of the data.
In order to use both indexes I put the "ALL#LOCATION" and the "ALL#VEHICLE" partition keys in different indexes.
Notice that vehicle 4 doesn't have a PK1. See what happens to GSI1. This is what's called a sparse index.
I denormalized the vehicle-location relationship. Assuming that the location ID and the location name are immutable, this is fine. The problem arises when the attributes you denormalize are mutable; avoid that if possible.
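
The Workbench model itself is not reproduced in the text, so the following is only an in-memory sketch of how such a design might behave, written in plain Python with no AWS calls. The key names PK, SK and PK1 and the "ALL#..." partition values come from the answer; the item IDs, the "LOC#"/"VEH#" prefixes on sort keys, and the name and location_name attributes are assumptions for illustration.

```python
# In-memory stand-in for the table: no AWS calls are made.
# PK, SK and PK1 come from the answer; every other attribute
# name and all IDs are assumptions for illustration.
ITEMS = [
    {"PK": "LOC#1", "SK": "LOC#1", "PK1": "ALL#LOCATION", "name": "locationName1"},
    {"PK": "LOC#2", "SK": "LOC#2", "PK1": "ALL#LOCATION", "name": "locationName2"},
    # Vehicles carry a denormalized copy of the location name.
    {"PK": "ALL#VEHICLE", "SK": "VEH#1", "PK1": "LOC#1",
     "name": "vehicleName1", "location_name": "locationName1"},
    {"PK": "ALL#VEHICLE", "SK": "VEH#2", "PK1": "LOC#1",
     "name": "vehicleName2", "location_name": "locationName1"},
    {"PK": "ALL#VEHICLE", "SK": "VEH#3", "PK1": "LOC#2",
     "name": "vehicleName3", "location_name": "locationName2"},
    # Vehicle 4 is in no location, so it has no PK1 attribute.
    {"PK": "ALL#VEHICLE", "SK": "VEH#4", "name": "vehicleName4"},
]


def query_main(pk):
    """Emulate a Query on the main index (PK -> SK)."""
    return sorted((i for i in ITEMS if i["PK"] == pk), key=lambda i: i["SK"])


def query_gsi1(pk1):
    """Emulate a Query on GSI1 (PK1 -> SK).

    Items without a PK1 attribute are simply absent from the index,
    which is what makes it sparse.
    """
    return sorted((i for i in ITEMS if i.get("PK1") == pk1),
                  key=lambda i: i["SK"])


all_locations = query_gsi1("ALL#LOCATION")    # access pattern 1
all_vehicles = query_main("ALL#VEHICLE")      # access pattern 2
vehicles_in_loc1 = query_gsi1("LOC#1")        # access pattern 3
```

Each access pattern is a single query in this sketch, at the cost of keeping the copied location name in sync if it ever changes.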

SQLite query to get records with certain condition

My SQLite table is named Invoices and has the columns Part Number and Manufacturer.
I need to query the table so that it shows only the records whose part number has at least 2 different manufacturers.
I searched Stack Overflow and tried this solution:
QString Filter = "PART_NUMBER in (select PART_NUMBER FROM Invoices GROUP BY "
"PART_NUMBER HAVING count(PART_NUMBER)>1)";
model->setFilter(Filter);
model->select();
The problem with this solution is that it also shows part numbers whose rows all have the same manufacturer.
Edit:
In this example it should return part 2 only
You need to count Manufacturer:
select PART_NUMBER FROM Invoices GROUP BY
PART_NUMBER HAVING count(MANUFACTURER)>1
Ok, so you're saying that your data looks like this:
PART_NUMBER | MANUFACTURER
1 | A
2 | A
2 | A (duplicate entry)
3 | A
3 | B
4 | A
4 | B
Then you'd need to select with HAVING COUNT(DISTINCT MANUFACTURER) > 1.
In sqlite, this looks a bit more complex:
SELECT COUNT(MANUFACTURER) FROM (SELECT DISTINCT MANUFACTURER FROM Table WHERE ...);
See this blog post.
But that's more than QSqlQueryModel can do with setFilter(...).
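
For what it's worth, current SQLite versions accept COUNT(DISTINCT ...) directly inside HAVING, so the subquery from the question only needs that one change. The sketch below checks this against the sample data above using Python's standard sqlite3 module. The filter string has the same WHERE-clause shape as the one passed to setFilter in the question, but it has not been tried against the Qt model here, so treat that part as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Invoices (PART_NUMBER INTEGER, MANUFACTURER TEXT)")
conn.executemany("INSERT INTO Invoices VALUES (?, ?)", [
    (1, "A"),
    (2, "A"),
    (2, "A"),  # duplicate entry: same part, same manufacturer
    (3, "A"),
    (3, "B"),
    (4, "A"),
    (4, "B"),
])

# Same shape as the filter in the question, but counting distinct
# manufacturers per part instead of counting rows.
FILTER = (
    "PART_NUMBER IN (SELECT PART_NUMBER FROM Invoices "
    "GROUP BY PART_NUMBER HAVING COUNT(DISTINCT MANUFACTURER) > 1)"
)

rows = conn.execute(
    "SELECT PART_NUMBER, MANUFACTURER FROM Invoices "
    f"WHERE {FILTER} ORDER BY PART_NUMBER, MANUFACTURER"
).fetchall()
```

Part 2 is excluded because its two rows share one manufacturer, while parts 3 and 4 each have two different ones.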
This problem looks like a database design issue. Do you know about database normalization?
When you've normalized your tables, the problem becomes significantly simplified.

How to solve a spool space error with rank () over partition by SQL optimising?

I have a table holding information about contacts made to many different customers in the format
email_address | treatment_group | customer_id | contact_date |
I am trying to add a column that looks at each distinct customer and numbers the contacts they've received from longest ago to most recent. I'm using this code:
explain create table db.responses_with_rank
as
( select a.*,
rank() over (partition by customer_id order by contact_date asc) as xrank
from db.responses_with_rank a
)
with data
primary index (email_address, treatment_group)
My query is spooling out. There is a primary index of email_address, treatment_group that leads to a skew factor of 1.1 and a secondary primary index on customer_id. I've collected statistics on both sets of indexes. The table is quite large - around 200M records. Is there something I can try to optimize this query?
There is not enough information to determine the cause of the error.
To start, please add the following to your question:
TD version (select * from dbc.dbcinfo)
Execution plan
The statistics collection commands you have used
customer_id top frequencies (select top 10 customer_id,count(*) from db.responses_with_rank group by 1 order by 2 desc)
Do you have wide text columns in your table?
P.S.
I strongly recommend using create multiset table rather than create table.
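
The top-frequencies check requested above matters because the rank is partitioned by customer_id: if a few values (for example a placeholder ID) account for a large share of the rows, the work for that partition is likely to pile up in one place. The sketch below runs the same kind of query on a toy table with Python's standard sqlite3 module, using LIMIT where Teradata uses TOP. The table name responses, the placeholder ID -1 and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE responses (
        email_address   TEXT,
        treatment_group TEXT,
        customer_id     INTEGER,
        contact_date    TEXT
    )
""")
# Toy data: every second row uses the placeholder customer_id -1,
# all other rows have a unique customer_id.
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?, ?)",
    [(f"user{i}@example.com", "A", -1 if i % 2 == 0 else i, "2020-01-01")
     for i in range(20)],
)


def top_customer_frequencies(conn, limit=10):
    """Return the most frequent customer_id values as (customer_id, row_count)."""
    return conn.execute(
        """
        SELECT customer_id, COUNT(*) AS n
        FROM responses
        GROUP BY customer_id
        ORDER BY n DESC, customer_id
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


top = top_customer_frequencies(conn)
total_rows = conn.execute("SELECT COUNT(*) FROM responses").fetchone()[0]
# Share of all rows that fall into the single largest partition.
largest_share = top[0][1] / total_rows
```

A large value for largest_share would suggest that one partition dominates the ranking step.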

Multiple tables with similar columns in a single DB

I am a web developer who's working on an Exam Generator project. Now I am stuck at one point.
I have one database with several tables. Four of these tables have somewhat similar columns. I want to know what the best practice is in such a case.
The similar tables I have are:
Exam (used to store the exam name and the number of questions included in the exam).
ID | ExamName | NumberofQuestions
UserExam (used to store the exams available to a user, with his grade in each exam he took).
ID | MemberID | ExamID | Grade
QuestionExam (used to store the question IDs included in each exam).
ID | ExamID | QuestionID
UserSolution (used to store the user's answers for each exam he took).
ID | MemberID | ExamID | QuestionID | UserAnswer
In the beginning, I wanted to merge the "Exam" table with the "QuestionExam" table, but then I asked myself if I merged them how would I have one ID for each exam? So I kept it as it is.
Everything is correct with the tables. If you want to be more descriptive, some conventions recommend naming the ID columns after their tables, such as Exam_ID, UserExam_ID, or UserSolution_ID, so that you can tell them apart in a join. It all depends on personal preference. It takes a little more writing but saves you a headache in the long run.
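
To illustrate the naming point, here is a short sketch with Python's standard sqlite3 module, using two of the tables from the question. It shows that a plain join returns two columns both called ID, and that aliasing them with descriptive names removes the ambiguity. The sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Exam (
        ID                INTEGER PRIMARY KEY,
        ExamName          TEXT,
        NumberofQuestions INTEGER
    );
    CREATE TABLE UserExam (
        ID       INTEGER PRIMARY KEY,
        MemberID INTEGER,
        ExamID   INTEGER REFERENCES Exam(ID),
        Grade    REAL
    );
    -- Made-up sample rows.
    INSERT INTO Exam VALUES (1, 'Algebra', 20);
    INSERT INTO UserExam VALUES (10, 7, 1, 85.0);
""")

# SELECT * over the join: both tables contribute a column named ID.
cur = conn.execute(
    "SELECT * FROM Exam JOIN UserExam ON UserExam.ExamID = Exam.ID"
)
ambiguous_names = [col[0] for col in cur.description]

# Descriptive aliases make it obvious which ID is which.
cur = conn.execute("""
    SELECT e.ID AS Exam_ID, ue.ID AS UserExam_ID, e.ExamName, ue.Grade
    FROM Exam e
    JOIN UserExam ue ON ue.ExamID = e.ID
""")
clear_names = [col[0] for col in cur.description]
clear_row = cur.fetchone()
```

Naming the columns Exam_ID and UserExam_ID in the schema itself has the same effect without needing aliases in every query.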

What's the best way to retrieve this data?

The architecture for this scenario is as follows:
I have a table of items and several tables of forms. Rather than having the forms own the items, the items own the forms. This is because one item can be on several forms (although only one of each type, but not necessarily on any). The forms and items are all tied together by a common OrderId. This can be represented like so:
OrderItems | Form A    | Form B etc....
-----------|-----------|
ItemId     | FormAId   |
OrderId    | OrderId   |
FormAId    | SomeField |
FormBId    | OtherVar  |
FormCId    | etc...
This works just fine for these forms. However, there is another form, (say, FormX) which cannot have an OrderId because it consists of items from multiple orders. OrderItems does contain a column for FormXId as well, but I'm confused about the best way to get a list of the "FormX"s related to a single OrderId. I'm using MySQL and was thinking maybe a stored proc was the best way to go on this, but I've never used a stored proc on MySQL and don't really know the best way to go about it. My other (kludgy) option was to hit the DB twice, first to get all the items that are for the given OrderId that also have a FormXId, and then get all their FormXIds and do a dynamic SELECT statement where I do something like (pseudocode)
SELECT whatever FROM sometable WHERE FormXId=x OR FormXId=y....
Obviously this is less than ideal, but I can't really think of any other way... anything better I could do either programmatically or architecturally? My back-end code is ASP.NET.
Thanks so much!
UPDATE
In response to the request for more info:
Sample input:
OrderId = 1000
Sample output
FormXs:
-----------------
FormXId | FieldA | FieldB | etc
-------------------------------
1003 | value | value | ...
1020 | ... .. ..
1234 | .. . .. . . ...
You see the problem is that FormX doesn't have one single OrderId but is rather a collection of OrderIds. Sometimes multiple items from the same order are on FormX, sometimes it's just one, most orders don't have any items on FormX. But when someone pulls up their order, I need for all the FormXs their items belong on to show up so they can be modified/viewed.
I was thinking of maybe creating a stored proc that does what I said above, run one query to pull down all the related OrderIds and then another to return the appropriate FormXs. But there has to be a better way...
I understand you need to get a list of the "FormX"s related to a single OrderId. You say that OrderItems does contain a column for FormXId.
You can issue the following query:
select
FormX.*
from
OrderItems
join
FormX
on
OrderItems.FormXId = FormX.FormXId
where
OrderItems.OrderId = @orderId
You need to pass @orderId to your query and you will get a record set with the FormX records related to this order.
You can either package this query up as a stored procedure with an @orderId parameter, or you can use dynamic SQL and substitute @orderId with the real order number you are executing the query for.
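
As a quick check of the join, the sketch below runs it against toy data with Python's standard sqlite3 module rather than MySQL. It adds DISTINCT, since the question mentions that several items from one order can sit on the same FormX and would otherwise produce repeated rows. The FieldA values, the item rows and the helper name formx_for_order are assumptions for illustration, and the :order_id placeholder is sqlite3's parameter syntax, which differs from what a MySQL connector expects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FormX (
        FormXId INTEGER PRIMARY KEY,
        FieldA  TEXT
    );
    CREATE TABLE OrderItems (
        ItemId  INTEGER PRIMARY KEY,
        OrderId INTEGER NOT NULL,
        FormXId INTEGER REFERENCES FormX(FormXId)  -- NULL when not on a FormX
    );
    INSERT INTO FormX VALUES (1003, 'a'), (1020, 'b'), (1234, 'c');
    INSERT INTO OrderItems VALUES
        (1, 1000, 1003),
        (2, 1000, 1003),  -- second item of the same order on the same FormX
        (3, 1000, 1020),
        (4, 1000, NULL),
        (5, 2000, 1234);
""")


def formx_for_order(conn, order_id):
    """Return each FormX row that has at least one item from the given order."""
    return conn.execute(
        """
        SELECT DISTINCT FormX.*
        FROM OrderItems
        JOIN FormX ON OrderItems.FormXId = FormX.FormXId
        WHERE OrderItems.OrderId = :order_id
        ORDER BY FormX.FormXId
        """,
        {"order_id": order_id},
    ).fetchall()


forms_for_1000 = formx_for_order(conn, 1000)
```

This is a single round trip to the database, so neither the two-step lookup nor the dynamically built OR list from the question is needed.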

Resources