Update second sqlite database from first database - sqlite

I have database A which gets new data every second day. Now I have a second database B, which is a duplicate of database A, which is stored on a different server for read access.
What I do at the moment is copy the whole database A via rsync to replace database B. This is no problem with regard to locking and reading, because I know exactly when the writing into A has finished. Access to B is also no problem.
Now A is getting quite large, and the copying becomes unreliable because of network errors so that I regularly have to copy the database a second time.
Is there a way (I am sure there is) to let sqlite do the updating of B? I have seen .clone, but, again, it would transfer a large amount of data, when only a small amount has changed.
Any suggestions on how I can make the process of copying the data newly added to A over to B at a later time more efficient (using sqlite3)?

Related

Meta-data from SQLite

Is there any way to query a SQLite database for basic meta data such as:
Last date/time updated
Hash of database to indicate "state"
I am just looking for a simple, infrastructural way to have a script evaluate different databases and take a reasonable point of view on whether they are the same "state" as other databases in a different environment (PROD and DEV for instance).
In my experience, if no update, new record, or any other change is made to the SQLite database file, the last modified time of the file doesn't change. So the last modified time should suffice as the time of the last change made to the database.
If two database files with the same state are only accessed for reading, their modified times stay the same.
Similarly, you can get the file sizes for comparison.
You can use the whole file to calculate a hash. If you consider the same data in the database to be the same "state" regardless of any differences in its history, then you probably want a hash of all the records in the database, which is probably not simple.
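The checks described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `file_state` and `content_hash` are hypothetical, and hashing the output of `iterdump()` is one possible way to approximate a "hash of all the records", not an established SQLite feature. Two databases holding the same rows with different rowids or a different table creation order may still dump differently.

```python
import hashlib
import os
import sqlite3


def file_state(path):
    """Return (mtime, size, sha256 hex digest) of the database file itself."""
    st = os.stat(path)
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large databases do not need to fit in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return st.st_mtime, st.st_size, digest.hexdigest()


def content_hash(path):
    """Hash the SQL dump, so only schema and rows matter, not the file layout."""
    digest = hashlib.sha256()
    conn = sqlite3.connect(path)
    try:
        for statement in conn.iterdump():
            digest.update(statement.encode("utf-8"))
            digest.update(b"\n")
    finally:
        conn.close()
    return digest.hexdigest()
```

A script could compare `file_state()` first, since it is cheap, and fall back to `content_hash()` only when the files differ byte for byte.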

sqlite: online backup is not identical to original

I'm doing an online backup of an (idle) database using the example 2 code from here. The backup file is not identical to the original (the length is the same, but it differs in 3 bytes), although the .dump from both databases is identical. Backup files taken at different times are identical to each other.
This isn't great, as I'd like a simple guarantee that the backup is identical to the original, and I'd like to record checksums on the actual database and the backups to simplify restores. Any idea if I can get around this, or if I can use the backup API to generate files that compare identically?
The online backup can write into an existing database, so this writing is done inside a transaction.
At the end of such a transaction, the file change counter (offsets 24-27) is changed to allow other processes to detect that the database was modified and that any caches in those processes are invalid.
This change counter does not use the value from the original database because it might be identical to the old value of the destination database.
If the destination database is freshly created, the change counter starts at zero.
This is likely to be a change from the original database, but at least it's consistent.
The byte at offset 28 was decreased because the database has some unused pages.
The byte at offset 44 was changed because the database does not actually use new schema features.
You might be able to avoid these changes by doing a VACUUM before the backup, but this wouldn't help for the change counter.
I would not have expected them to be identical, simply because the backup API only ensures that backups are self-consistent (i.e. transactions in progress are ignored).
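To see which bytes actually differ, and to compare a backup with its original without relying on raw checksums, something like the following Python sketch could be used. The helper names are hypothetical. It assumes the change counter is the 4-byte big-endian integer at header offsets 24-27, as described above, and it reads both files fully into memory, which is only reasonable for small databases.

```python
import sqlite3
import struct


def change_counter(path):
    """Read the 4-byte big-endian file change counter at header offsets 24-27."""
    with open(path, "rb") as fh:
        header = fh.read(100)  # the SQLite file header is 100 bytes long
    return struct.unpack(">I", header[24:28])[0]


def differing_offsets(path_a, path_b):
    """Return the byte offsets at which two files of equal length differ."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def same_content(path_a, path_b):
    """Compare the logical content (the SQL dump) instead of the raw bytes."""
    conn_a, conn_b = sqlite3.connect(path_a), sqlite3.connect(path_b)
    try:
        return list(conn_a.iterdump()) == list(conn_b.iterdump())
    finally:
        conn_a.close()
        conn_b.close()
```

If checksums are needed for restores, one option is to record the checksum of the backup file itself, since backups taken from an unchanged database were observed above to be identical to each other.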

Attaching two memory databases

I am collecting data every second and storing it in a ":memory:" database. The inserts into this database happen inside a transaction.
Every time a request is sent to the server, the server reads data from the first in-memory database, does some calculation, stores the result in a second database, and sends it back to the client. For this, I am creating another ":memory:" database to store the aggregated information from the first db. I cannot use the same db because I need to do some large calculation to get the aggregated result. This cannot be done inside the transaction (because if one collection takes 5 seconds, I will lose the other 4 seconds of data). I cannot create a table in the same database because I would not be able to write the aggregate data while the original data is being collected and inserted (that happens inside a transaction, and data is collected every second).
-- Sometimes I want to retrieve data from both databases. How can I link these two in-memory databases? Using an ATTACH DATABASE statement, I can attach the second db to the first one. But the next time a request comes in, how do I check whether the second db exists?
-- Suppose I attach the second in-memory db to the first one. Will it lock the second database when we write data to the first db?
-- Is there any other way to store this aggregated data?
As far as I understand your idea, I don't think that you need two databases at all. I suppose you are misinterpreting the idea of transactions in SQL.
When you begin a transaction, other processes are still allowed to read data. If you are only reading data, you probably don't need a database lock.
A possible workflow could look like the following:
Insert some data into the database (use a transaction just for the insertion process)
Perform heavy calculations on the database (but do not use a transaction, otherwise it will prevent other processes from inserting any data into your database). Even if this step includes really heavy computation, you can still insert and read data from another process, as SELECT statements will not lock your database.
Write results to the database (again, by using a transaction)
Just make sure that heavy calculations are not performed within a transaction.
If you want a more detailed description of this solution, look at the documentation about the file locking behaviour of sqlite3: http://www.sqlite.org/lockingv3.html
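The workflow above can be sketched with Python's sqlite3 module. This is a minimal illustration, not the asker's actual setup: the table names `samples` and `aggregates` and the mean calculation are made-up stand-ins, and a single connection is used for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts INTEGER, value REAL)")
conn.execute("CREATE TABLE aggregates (n INTEGER, mean REAL)")

# 1. Insert raw data inside a short transaction.
#    `with conn:` commits on success and rolls back on an exception.
with conn:
    conn.executemany(
        "INSERT INTO samples (ts, value) VALUES (?, ?)",
        [(1, 10.0), (2, 20.0), (3, 30.0)],
    )

# 2. Read the data, then do the heavy computation in application code
#    while no write transaction is being held open.
values = [row[0] for row in conn.execute("SELECT value FROM samples")]
mean = sum(values) / len(values)  # stand-in for the expensive calculation

# 3. Write the result back inside another short transaction.
with conn:
    conn.execute(
        "INSERT INTO aggregates (n, mean) VALUES (?, ?)",
        (len(values), mean),
    )

result = conn.execute("SELECT n, mean FROM aggregates").fetchone()
```

The point of the structure is that each write transaction only covers the statements that actually modify data, so the one-second collection loop is never blocked for the duration of the calculation.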

How can i improve the performance of the SQLite database?

Background: I am using an SQLite database in my Flex application. The size of the database is 4 MB and it has 5 tables:
table 1 has 2500 records
table 2 has 8700 records
table 3 has 3000 records
table 4 has 5000 records
table 5 has 2000 records.
Problem: Whenever I run a select query on any table, it takes approximately 50 seconds to fetch data from the database tables. This has made the application quite slow and unresponsive while it fetches the data from the table.
How can I improve the performance of the SQLite database so that the time taken to fetch the data from the tables is reduced?
Thanks
As I said in a comment, without knowing what structures your database consists of and what queries you run against the data, there is nothing we can infer about why your queries take so much time.
However, here is an interesting read about indexes: Use the index, Luke!. It tells you what an index is, how you should design your indexes, and what benefits you can get from them.
Also, if you can post the queries, the table schemas, and the cardinalities (not the contents), it might help.
Are you using asynchronous or synchronous execution modes? The difference between them is that asynchronous execution runs in the background while your application continues to run. Your application will then have to listen for a dispatched event and then carry out any subsequent operations. In synchronous mode, however, the user will not be able to interact with the application until the database operation is complete since those operations run in the same execution sequence as the application. Synchronous mode is conceptually simpler to implement, but asynchronous mode will yield better usability.
The first time SQLStatement.execute() is called on a SQLStatement instance, the statement is prepared automatically before executing. Subsequent calls will execute faster as long as the SQLStatement.text property has not changed. Reusing the same SQLStatement instances is better than creating new instances again and again. If you need to change your queries, then consider using parameterized statements.
You can also use techniques such as deferring what data you need at runtime. If you only need a subset of data, pull that back first and then retrieve other data as necessary. This may depend on your application scope and what needs you have to fulfill though.
Specifying the database along with the table names prevents the runtime from checking each database to find a matching table if you have multiple databases. It also helps prevent the runtime from choosing the wrong database. Do SELECT email FROM main.users; instead of SELECT email FROM users; even if you only have one single database. (main is automatically assigned as the database name when you call SQLConnection.open.)
If you happen to be writing lots of changes to the database (multiple INSERT or UPDATE statements), then consider wrapping them in a transaction. Changes will be made in memory by the runtime and then written to disk. If you don't use a transaction, each statement will result in multiple disk writes to the database file, which can be slow and consume lots of time.
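As a rough illustration of the batching idea, here is the same pattern in Python's sqlite3 module rather than the AIR SQLConnection/SQLStatement API that the question uses, so treat it as an analogy only. The `users` table and its columns are assumptions made up for the example. It also uses a parameterized statement and the `main.` prefix mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

rows = [(i, f"user{i}@example.com") for i in range(1000)]

# One transaction around the whole batch: a single commit at the end
# instead of one commit per INSERT statement.
with conn:
    # The statement text stays the same; only the bound parameters change.
    conn.executemany("INSERT INTO main.users (id, email) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM main.users").fetchone()[0]
```

In AIR the equivalent would presumably be built around the connection's own transaction methods; check the Flex documentation for the exact calls.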
Try to avoid any schema changes. The table definition data is kept at the start of the database file. The runtime loads these definitions when the database connection is opened. Data added to tables is kept after the table definition data in the database file. If changes such as adding columns or tables are made, the new table definitions will be mixed in with table data in the database file. The effect of this is that the runtime will have to read the table definition data from different parts of the file rather than from the beginning. The SQLConnection.compact() method restructures the table definition data so that it is at the beginning of the file, but its downside is that this method can also take a long time, more so if the database file is large.
Lastly, as Benoit pointed out in his comment, consider improving the SQL queries and table structure that you're using. It would be helpful to know whether your database structure and queries are the actual cause of the slow performance or not. My guess is that you're using synchronous execution. If you switch to asynchronous mode, you'll see better responsiveness, but that doesn't mean it has to stop there.
The Adobe Flex documentation online has more information on improving database performance and best practices working with local SQL databases.
You could try indexing some of the columns used in the WHERE clause of your SELECT statements. You might also try minimizing usage of the LIKE keyword.
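To check whether an index is actually picked up, SQLite's EXPLAIN QUERY PLAN can be run before and after creating it. The sketch below uses Python's sqlite3 module with a made-up `users` table and index name, since the real schema is unknown. The exact wording of the plan text varies between SQLite versions, so only the presence of the index name is meaningful.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)
conn.commit()

query = "SELECT name FROM users WHERE email = ?"


def plan(sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index the WHERE clause forces a scan of the whole table.
before = plan(query, ("user42@example.com",))

# Index the column used in the WHERE clause, then look at the plan again.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query, ("user42@example.com",))
```

Printing `before` and `after` shows a table scan turning into a search that names `idx_users_email`.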
If you are joining your tables together, you might try simplifying the table relationships.
Like others have said, it's hard to get specific without knowing more about your schema and the SQL you are using.

Why does clearing an SQLite database not reduce its size?

I have an SQLite database.
I created the tables and filled them with a considerable amount of data.
Then I cleared the database by deleting and recreating the tables. I confirmed that all the data had been removed and the tables were empty by looking at them using SQLite Administrator.
The problem is that the size of the database file (*.db3) remained the same after it had been cleared.
This is of course not desirable as I would like to regain the space that was taken up by the data once I clear it.
Did anyone make a similar observation and/or know what is going on?
What can be done about it?
From here:
When an object (table, index, trigger, or view) is dropped from the database, it leaves behind empty space. This empty space will be reused the next time new information is added to the database. But in the meantime, the database file might be larger than strictly necessary. Also, frequent inserts, updates, and deletes can cause the information in the database to become fragmented - scattered out all across the database file rather than clustered together in one place.
The VACUUM command cleans the main database by copying its contents to a temporary database file and reloading the original database file from the copy. This eliminates free pages, aligns table data to be contiguous, and otherwise cleans up the database file structure.
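The effect can be reproduced with a small Python sketch. The file name `demo.db3` and the table `big` are made up for the example, and it assumes the default setting where auto_vacuum is off. Note that VACUUM cannot run inside an open transaction, which is why the connection is opened in autocommit mode here.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db3")
# isolation_level=None puts the connection in autocommit mode.
conn = sqlite3.connect(path, isolation_level=None)

conn.execute("CREATE TABLE big (payload TEXT)")
conn.execute("BEGIN")
conn.executemany(
    "INSERT INTO big (payload) VALUES (?)",
    [("x" * 1000,) for _ in range(2000)],
)
conn.execute("COMMIT")
size_full = os.path.getsize(path)

# Dropping the table frees its pages but leaves them inside the file.
conn.execute("DROP TABLE big")
size_after_drop = os.path.getsize(path)
free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]

# VACUUM rebuilds the file without the free pages.
conn.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
conn.close()
```

Comparing `size_full`, `size_after_drop`, and `size_after_vacuum` shows that only the VACUUM step gives the space back to the file system.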
Database sizes work like high-water marks: if the water rises, the mark goes up, but when the water recedes, the mark stays where it was.
You should look into shrinking the database.

Resources