I am using QSQlQuery on a sqlite3 database. To fetch a particular item , I was populating the result from 4 different tables. I thought joining the tables would increase the performance/speed and get the result faster. So I joined 2 tables initially but it takes longer time to fetch the data after joining the tables (?)
Any suggestion on how to improve the performance would be really appreciated. Also, I was looking at the http://qt-project.org/doc/qt-4.8/qsqlquery.html and it is mentioned that using setForwardOnly would increase the performance on some databases. Any idea if it would work for SQLite3?
Thanks!
According to this link,
http://sqlite.org/cvstrac/wiki?p=PerformanceTuning
SQLite implements JOIN USING by translating the USING clausing into some extra WHERE clause terms. It does the same with NATURAL JOIN and JOIN ON. So while those constructs might be helpful to the human reader, they don't really make any difference to SQLite's query optimizer.
-I was wrong to join two tables and expect the fetch to be faster. It does not work with SQLite database. Instead using a "where" clause and joining two results directly definitely has some positive impact on the performance.
(example :
select * from A,B where A.id = B.id where A.id = 1; instead of
select * from A left outer join B on A.id = B.id where A.id = 1)
The SQLite translates first statement to second before compiling and you could save on the small amount of CPU time by directly using the second statement
Related
I am very confused by the cosmosdb documentation on joins. When I think of a join conventionally, I think of 2 tables, with 1 shared id, on which I perform the join. These 2 tables have different schemas, but the result of the join is a combined table with a merge of the columns from both tables. The join for cosmosdb does not seem to me intuitively congruent with that.
I have a collection with heterogenous data. Each document can have a different structure from the next. I want to count the number of documents that have a value that is present in the result set of a subquery. Intuitively, I want to do something like this:
SELECT COUNT(1) as c
FROM CollectionName as outer
where outer.type = "table"
JOIN ((SELECT c.id from c where c.type = "database") as inner) on outer.databaseId == t.id
// count the number of tables that are in deleted databases
It would seem like I would need to do a join on the result of the subquery with the result of the outer query, and then process that resulting table. But I am not understanding right now how to do that:
Select COUNT(1)
from Collection outer
where outer.type = 'table'
JOIN (select c.id from c IN outer.databaseId where c.type = "database" and c.state = "deleted")
I am constantly getting a 400 with the above query. So how am I supposed to think about joins in cosmosdb?
Cosmos is a document database. It stores and operates on json data which can be in hierarchical format. Joins in Cosmos reference tuples within these hierarchies where they can be projected with other data in the document.
There is a really good article that talks through this at pretty deep level but also have lots of examples too, Joins in Cosmos DB.
This takes some getting used to writing queries like this but once you get the hang of it you'll be ok. You can easily practice queries using the Query Playground that has a bunch of sample queries for nutrition dataset with food and ingredients. Or follow along with the families data in the docs. You can create additional items and then write some queries to see how joins work.
Hope that is helpful.
The below crashes my DB Browser. Essentially I am trying to sum sales ("sales") by a sales person ("name") that occurred between two dates ("beg_period" and "end_period") pulled from a separate table.
SELECT ta.name, ta.beg_period, ta.end_period,
(SELECT SUM(tb.sales)
FROM sales_log tb
WHERE ta.name = tb.name
AND tb.date BETWEEN ta.beg_period AND ta.end_period
)
FROM performance ta
;
The nested query can be re-written as a single query with a standard join.
SELECT ta.name, ta.beg_period, ta.end_period, SUM(tb.sales)
FROM performance ta INNER JOIN sales_log tb
ON ta.name = tb.name
WHERE tb.date BETWEEN ta.beg_period AND ta.end_period
GROUP BY ta.name, ta.beg_period, ta.end_period;
My guess is that the original query was okay (however inefficient), but DB Browser just didn't know how to interpret the subquery for whatever parsing it attempts, etc. In other words, just because it crashed DB Browser doesn't mean that it would crash sqlite library. Try another sqlite database manager.
I am learning SQLite and I'm using this database to learn how to correctly use querys but I'm struggling specially when I have to use data from multiple tables to get some information.
For example, with the given database, is there a way to get the first name, last name and the name of songs that every customer has bought?
All you have to do is a simple SQL SELECT query. When you say you're having trouble getting it from multiple tables, I'm not sure if you're trying to get the data from all of the tables in one single query, as that is not necessary. You just need to have multiple instances of the SELECT query, just for different tables (and different column names).
SELECT firstName, lastName, songName FROM table_name
You have to study about JOINS:
select
c.FirstName,
c.LastName,
t.Name
from invoice_items ii
inner join tracks t on t.trackid = ii.trackid
inner join invoices i on i.invoiceid = ii.invoiceid
inner join customers c on c.customerid = i.customerid
In this query there are 4 tables involved and the diagram in the link you posted, shows exactly their relationships.
So you start from the table invoice_items where you find the bought songs and join the other 3 tables by providing the columns on which the join will be set.
One more useful thing to remember: aliases for tables (like c for customers) and if needed for columns also.
You need to use joins to get data from multiple tables. In this case I'd recommend you using inner joins.
In case your are not familiar with joins, this is a very good article that explains the different types of joins supported in SQLite.
SQLite INNER JOINS return all rows from multiple tables where the join
condition is met.
This query will return the first and last name of customers, and the tracks they purchased.
select customers.FirstName,
customers.LastName,
tracks.name as PurchasedTracks from invoice_items
inner join invoices on invoices.InvoiceId = invoice_items.InvoiceId
inner join customers on invoices.CustomerId = customers.CustomerId
inner join tracks on invoice_items.TrackId = tracks.TrackId
order by customers.LastName
I have two tables tool , tool_attribute.
tool has 12 columns and tool_attribute has 5.
Information i needed from the tables :
tool - refid, serial, type, id
tool_attribute - key, value, id (There will be multiple entries for this)
Right now i have around 18264 in tool and 255696 in tool_attribute
Current Query :
select
tool.refid,
tool.serial,
tool_attribute.value,
tool.type
from tool
inner join tool_attribute
on tool.id = tool_attribute.id
where
(tool_attribute.val LIKE '%t00%' or
tool.serial LIKE '%t00%')
group by tool.refid
order by tool.serial asc;
This take around 750ms which is quite fast but i want to make it much faster. I run this code on low memory windows 6.0 device so it takes too much time.
Is there any way i could make it faster ?
You can try adding indices to the columns involved in the join:
CREATE INDEX idx_tool ON tool (id);
CREATE INDEX idx_tool_attr ON tool_attribute (id);
The LIKE conditions in your WHERE clause would preclude any chance of using an index on the columns involved, I think. The reason for this is a LIKE expression of the form %something eliminates the chance to search through a B-tree, which uses the suffix from left to right to find something. If you could rephrase your WHERE logic using something similar to LIKE 'something%' then an index could be used there as well.
A strange thing, that I don't know the cause, is happenning when trying to collect results from a db2 database.
The query is the following:
SELECT
COUNT(*)
FROM
MYSCHEMA.TABLE1 T1
WHERE
NOT EXISTS (
SELECT
*
FROM
MYSCHEMA.TABLE2 T2
WHERE
T2.PRIMARY_KEY_PART_1 = T1.PRIMARY_KEY_PART_2
AND T2.PRIMARY_KEY_PART_2 = T1.PRIMARY_KEY_PART_2
)
It is a very simple one.
The strange thing is, this same query, if I change COUNT(*) to * I will get 8 results and using COUNT(*) I will get only 2. The process was repeated some more times and the strange result is still continuing.
At this example, TABLE2 is a parent table of the TABLE1 where the primary key of the TABLE1 is PRIMARY_KEY_PART_1 and PRIMARY_KEY_PART_2, and the primary key of the TABLE2 is PRIMARY_KEY_PART_1, PRIMARY_KEY_PART_2 and PRIMARY_KEY_PART_3.
There's no foreign key between them (because they were legacy ones) and they have a huge amount of data.
The DB2 query SELECT VERSIONNUMBER FROM SYSIBM.SYSVERSIONS returns:
7020400
8020400
9010600
And the client used is SquirrelSQL 3.6 (without the rows limit marked).
So, what is the explanation to this strange result?
Without the details (including, at least, the exact Db2 version and DDL for both tables and their indexes) it can be just anything, and even with that details only IBM support will be really able to say, what is the actual reason.
Generally this looks like damaged data (e.g. differences in index vs table data).
Worth to open the support case with IBM.