I'm writing an application which produces a lot of data to store in a database.
The DB schema is very simple: it's a table with just 4 columns, but I must fill it with more than 30000 rows.
I'm using SQLite and QSql as API.
Data is produced very fast (no sleeps) and I'm using QSqlQuery to insert a row at time.
However it seems that it takes 7-8 seconds to store 100 rows (I'm using QTime for time counting).
I tried using QSqlTableModel but I noticed no performance improvements, even calling QSqlTableModel::submitAll every 1000 rows (QTime shows 70-80 seconds for 1000 rows).
Is there any way to store rows faster? What is the fastest way to fill a table with SQLite?
You could try looking at whether you've got transactions set up correctly; they're expensive because they have to sync to disk to commit.
Also bear in mind that SQLite is more heavily optimized for reading anyway.
You might try dropping any indexes at the start and then adding them back after all records have been imported. Results will vary of course if you're emptying the table first or just appending new records.
Related
I have an AS tabular model that contains a fact table with 20 mil rows. I have partitioned this so only the new rows get added to each day... however occasionally, a historical row (from years ago) will be modified. I can identify this modified row in SQL (using the last modified timestamp) however would it be possible for me to refresh the row in SSAS to reflect this change without having to refresh my entire data model? How would I achieve this?
First, 20 million rows is not a lot. I’m expecting that will only take 5-10 minutes to process unless your SQL queries are very inefficient or very wide. So why bother to optimize something which may be fast enough already?
If you do need to optimize it, you will first want to partition the large fact table by some date element. Since you only have 20 million rows I would suggest partitioning by year. Optimal compression will be achieved with around 8 million rows per partition. Over-partitioning (such as creating thousands of daily partitions) is counter-productive.
When a new row is added you could perform a ProcessAdd to insert just the new records to the partitions in question. However I would recommend just doing a ProcessFull on any year partitions which have any inserts, updates or deletes in SQL.
SSAS doesn’t support updating a specific row. Thus you have to follow the ProcessFull advice above.
There are several code examples including this one which may help you.
Again this may be overkill if you only have 20 million rows.
I am using SQLite because this is for cross-platform. I have about 10 tables with a small amount of data (maybe a few dozen rows each), but then I also have a set of data which might have a million or more rows.
The small dataset isn't really modified that much, just queried, but the large data set will be queried and modified frequently.
Rather than have a single SQLite database with all the tables in it, I was wondering if splitting it into two databases might be smartest.
Basically I'd have one database, lets call it "settings", with the 10 tables in it. I'd then have another database, lets call it "userdata", with the million rows.
I'll be creating a third database called "audits" where I record each change to the "userdata" database. This database is expected to grow (for a short time period).
I am just wondering if people have an opinion as to whether it is a good idea to split my data into multiple databases or if I should just have one massive one.
My thinking is the queries on the "userdata" database might be slightly more efficient since it will only have one table.
Note, this is not for long-term. It is for a short period of time. It will be queried and edited for about a week, then it is done.
I am using websql to store data in a phonegap application. One of table have a lot of data say from 2000 to 10000 rows. So when I read from this table, which is just a simple select statement it is very slow. I then debug and found that as the size of table increases the performance deceases exponentially. I read somewhere that to get performance you have to divide table into smaller chunks, is that possible how?
One idea is to look for something to group the rows by and consider breaking into separate tables based on some common category - instead of a shared table for everything.
I would also consider fine tuning the queries to make sure they are optimal for the given table.
Make sure you're not just running a simple Select query without a where clause to limit the result set.
I have a very simple query:
SELECT count(id), min(id), max(id), sum(size), sum(frames), sum(catalog_size + file_size)
FROM table1
table1 holds around 3000 to 4000 records.
My problem is that it takes around 20 seconds to this query to run. And since it is called more than once, the delay is pretty obvious to the customer.
Is it normal to this query to take 20 seconds? Is there any way to improve the run time?
If I run this same query from SQLite Manager it takes milliseconds to execute. The delay only occurs if the query is called from our software. EXPLAIN and EXPLAIN QUERY PLAN didn't help much. We use SQLite 3.7.3 version and Windows XPe.
Any thoughts how to troubleshoot this issue or improve the performance of the query?
All the sums require that every single record in the table must be read.
If the table contains more columns than those shown above, then the data read from the disk contains both useful and useless values (especially if you have big blobs). In that case, you can try to reduce the data needed to be read for this query by creating a covering index over exactly those columns needed for this query.
In SQLite versions before 3.7.15, you need to add an ORDER BY for the first index field to force SQLite to use that index, but this doesn't work for all queries. (For your query, try updating to this beta, or wait for 3.7.15.)
i have an dataset around 1000 records in it. i have made changes around 50 rows in dataset,
i have created an new
dataset dsnew= ds.GetChanges();
now this dsNew would contain all the new 50 rows which we made changes, now i want to update these values to Database,
here i do not want to call 5o times my update command or stored procdure to update my values to table which would really decrease the performance
is there any better way to solve it.
Thanks Prince
This age old request to MS, but still you can try for following options,
ADO.Net batch update.
Note : For insertion you can use SQL Bulk Copy,