In the official WAL documentation, the difference between FULL and NORMAL synchronization is clearly stated:
Write transactions are very fast since they only involve writing the
content once (versus twice for rollback-journal transactions) and
because the writes are all sequential. Further, syncing the content to
the disk is not required, as long as the application is willing to
sacrifice durability following a power loss or hard reboot. (Writers
sync the WAL on every transaction commit if PRAGMA synchronous is set
to FULL but omit this sync if PRAGMA synchronous is set to NORMAL.)
But I can't find any documentation of the effect of PRAGMA synchronous = OFF in WAL mode. I suspect it is the same as NORMAL. Does anyone have an informed answer?
The documentation also says:
In WAL mode when synchronous is NORMAL (1), the WAL file is synchronized before each checkpoint and the database file is synchronized after each completed checkpoint and the WAL file header is synchronized when a WAL file begins to be reused after a checkpoint.
So with OFF, there is no difference for normal transactions, but any checkpoint will be unsafe.
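For reference, here is a minimal sketch using Python's built-in sqlite3 module (the file path is illustrative) showing how the two pragmas interact; PRAGMA synchronous reports 0 for OFF, 1 for NORMAL, and 2 for FULL:

```python
import os
import sqlite3
import tempfile

# WAL requires a file-backed database (an in-memory DB reports 'memory').
path = os.path.join(tempfile.mkdtemp(), "example.db")
conn = sqlite3.connect(path)

# Switch to WAL; the pragma returns the journal mode actually in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# NORMAL (1) skips the per-commit WAL sync; FULL (2) syncs every commit;
# OFF (0) additionally skips the syncs around checkpoints.
conn.execute("PRAGMA synchronous=NORMAL")
sync = conn.execute("PRAGMA synchronous").fetchone()[0]

print(mode, sync)  # wal 1
conn.close()
```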
Related
I am reading about what affects the durability of InnoDB and found two configuration options. I have the following questions:
What is the difference between the two options? My first guess is that innodb_flush_log_at_trx_commit affects the InnoDB redo log, and sync_binlog affects the standard MySQL binlog. Am I right?
My second question: if the logging process is divided into three phases (write to the log buffer, write to the OS cache, flush to disk), in which phase does replication to the secondary happen?
I have another question about sync_binlog. If sync_binlog is set to 0, according to the InnoDB documentation, the flush does not happen on commit but is delegated to the OS. Could it be the case that the binlog is synced before the commit, so that replication sees data that is not yet committed?
Add innodb_flush_method to the list.
innodb_flush_log_at_trx_commit should be 1 for safety (no data lost in a crash) or 2 for speed. (0, I think, is there for historical reasons and has no advantage.)
sync_binlog prevents a Slave from trying to read off the end of the Master's binlog (after a crash). Since the data was already sent to the Slaves, it is not a "data loss" issue, but an annoying error that is easily rectified by hand (by moving to the next binlog).
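For reference, the maximum-durability combination discussed here can be sketched as a my.cnf fragment (the values are the common recommendation, not taken from the question; adjust for your workload):

```ini
[mysqld]
# Write and fsync the InnoDB redo log at every transaction commit.
innodb_flush_log_at_trx_commit = 1
# fsync the binary log after every transaction commit.
sync_binlog = 1
```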
I think this is the order of the replication steps. Note: In the case of a transaction, nothing happens until the COMMIT. (See also "binlog_cache_size".)
Send data to Slaves and flush the copy to the binlog if sync_binlog is on (else let it eventually be flushed). (I don't know which happens first, or even whether they are done by separate threads.)
The I/O thread on the Slave copies the data to its "relay log".
Eventually (usually right away), the execute thread performs the query.
(Multi-source replication and parallel execution on the Slave add further complications.)
Semi-sync gets involved somewhere.
Galera -- see "gcache", etc.
What do you mean by "secondary replication"?
Your item 3 may be referring to an obscure case where something could drop through the cracks because too many pieces of hardware are involved.
If you want reliability, see Galera Cluster and/or InnoDB Cluster. This goes beyond what is available with simple Master-Slave replication even with semi-sync.
I have two questions about SQLite VACUUM (and possibly WAL):
• If multiple processes have the DB open, do all SQL statements in all processes need to be finalized for VACUUM to succeed?
• Why would VACUUM sometimes have no effect (no space reclaimed) while SQLite returns SQLITE_OK?
A bit more detail about my problem:
I have a database in WAL mode accessed by two processes. At some point, the user can choose to drop the data from the database. Because the database can be open in multiple processes, I delete the records and then run VACUUM to reclaim the disk space (instead of closing the connection and deleting the file).
The problem is that if both processes have a DB connection open, VACUUM from one of them returns OK but does not actually reclaim the space.
I think what happens is that VACUUM won't succeed as long as there is an outstanding SQL statement in any process. The problem is that I do not want to make those two processes aware of each other.
What I am considering is doing VACUUM from both processes, so that whichever closes its connection last (upon the user's request to drop the data) takes care of space reclamation. I am also considering auto_vacuum (I am aware of its limitations, but DELETEs are not very frequent on this database).
The documentation says:
A VACUUM will fail if there is an open transaction, or if there are one or more active SQL statements when it is run.
Just as with any other kind of database modification, a writable transaction requires that no other read or write transaction is active; this usually requires that all statements are reset or finalized.
When in WAL mode, a write transaction is not blocked by other read transactions, but this requires that old data be kept. You should initiate a truncate checkpoint in addition to the vacuum.
The easiest way to handle this would be to not run VACUUM at all; the free space will be reused later.
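To illustrate the checkpoint-plus-VACUUM sequence, here is a sketch using Python's built-in sqlite3 module (the path and table name are made up); note that with a second connection holding an open read transaction, the same VACUUM could return OK without shrinking the file:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA journal_mode=WAL")

# Grow the main database file with ~1 MB of throwaway rows.
conn.execute("CREATE TABLE t (data BLOB)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [(b"x" * 1024,) for _ in range(1000)])
conn.commit()
conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
size_before = os.path.getsize(path)

# Delete the rows, compact the database, and checkpoint so the
# main file (not just the WAL) reflects the vacuumed content.
conn.execute("DELETE FROM t")
conn.commit()
conn.execute("VACUUM")
conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
size_after = os.path.getsize(path)

print(size_after < size_before)  # True
conn.close()
```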
I want to use WAL mode for performance and reliability. However, my environment doesn't have an mmap() function, so I can't compile SQLite with WAL (WAL needs mmap()). Although setting PRAGMA locking_mode=EXCLUSIVE allows use of WAL without mmap() (in this case, the WAL-index is created on heap memory, not in a shared file), it is not a good solution for applications that manage multiple database connections.
I use SQLite with multiple database connections (for inter-thread concurrency) in one process, and mmap() seems to be used only for inter-process memory sharing. So I expect there is a way to use WAL without mmap() in a single-process environment, but I can't find a good solution. Are there any ideas to solve this problem?
Thanks.
The code in the SQLite library always creates a memory-mapped file for the wal-index.
If you want to use WAL without mmap, you have to change that code to use a plain block of shared memory instead.
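As a side note, the exclusive-locking workaround mentioned in the question can be sketched with Python's built-in sqlite3 module (the path is illustrative); with locking_mode=EXCLUSIVE, SQLite keeps the wal-index on heap memory instead of a shared memory-mapped file, at the cost of allowing only one connection:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "solo.db")
conn = sqlite3.connect(path)

# Take the exclusive lock first, then switch to WAL; on a build
# without shared-memory support, this is the only way WAL works.
conn.execute("PRAGMA locking_mode=EXCLUSIVE")
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()
print(mode)  # wal
conn.close()
```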
There is the following statement in the SQLite FAQ:
A transaction normally requires two complete rotations of the disk platter, which on a 7200RPM disk drive limits you to about 60 transactions per second.
As far as I know, there is a cache on the hard disk, and there might also be an extra cache in the disk driver that abstracts the operation perceived by the software from the actual operation against the disk platter.
Then why and how exactly are transactions so strictly bound to disk platter rotation?
From Atomic Commit In SQLite
2.0 Hardware Assumptions
SQLite assumes that the operating system will buffer writes and that a write request will return before data has actually been stored in the mass storage device. SQLite further assumes that write operations will be reordered by the operating system. For this reason, SQLite does a "flush" or "fsync" operation at key points. SQLite assumes that the flush or fsync will not return until all pending write operations for the file that is being flushed have completed. We are told that the flush and fsync primitives are broken on some versions of Windows and Linux. This is unfortunate. It opens SQLite up to the possibility of database corruption following a power loss in the middle of a commit. However, there is nothing that SQLite can do to test for or remedy the situation. SQLite assumes that the operating system that it is running on works as advertised. If that is not quite the case, well then hopefully you will not lose power too often.
Because it ensures data integrity by making sure the data is actually written onto the disk rather than held in memory. Thus, if the power goes off, the database is not corrupted.
This video http://www.youtube.com/watch?v=f428dSRkTs4 talks about the reasons why (e.g. because SQLite is actually used in a lot of embedded devices where the power might well suddenly go off).
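The per-transaction sync cost can be observed with a rough micro-benchmark using Python's built-in sqlite3 module (numbers will vary wildly across disks, and on an SSD or tmpfs the gap may be small); each autocommit INSERT is its own transaction, so under synchronous=FULL each one waits for a sync:

```python
import os
import sqlite3
import tempfile
import time

def timed_inserts(sync_mode, n=200):
    """Insert n rows, one transaction each, under the given
    PRAGMA synchronous setting; return (elapsed_seconds, row_count)."""
    path = os.path.join(tempfile.mkdtemp(), "bench.db")
    conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute(f"PRAGMA synchronous={sync_mode}")
    start = time.perf_counter()
    for i in range(n):
        conn.execute("INSERT INTO t VALUES (?)", (i,))  # one commit each
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count

full_time, full_count = timed_inserts("FULL")
off_time, off_count = timed_inserts("OFF")
print(full_count, off_count)  # 200 200
```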
Folks
I am implementing a file-based queue (see my earlier question) using SQLite. I have the following threads running in the background:
thread-1 empties a memory structure into the "queue" table (an INSERT into the "queue" table).
thread-2 reads and "processes" the "queue" table; it runs every 5 to 10 seconds.
thread-3 runs very infrequently; it purges old data that is no longer needed from the "queue" table and also runs VACUUM so the size of the database file remains small.
Now the behavior that I would like is for each thread to get whatever lock it needs (waiting with a timeout if possible) and then complete its transaction. It is OK if threads do not run concurrently; what is important is that once a transaction begins, it does not fail with "locking" errors such as "database is locked".
I looked at the transaction documentation, but there does not seem to be a "timeout" facility (I am using JDBC). Can the timeout be set to a large value on the connection?
One solution (untried) I can think of is to have a connection pool with a maximum of one connection. That way only one thread can connect at a time, so we should not see any locking errors. Are there better ways?
Thanx!
If it were me, I'd use a single database connection handle. If a thread needs it, it can acquire it within a critical section (or mutex, or similar) - this is basically a poor man's connection pool with only one connection in the pool :) It can do its business with the database. When done, it exits the critical section (or releases the mutex). You won't get locking errors if you carefully use the single DB connection.
-Don
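For what it's worth, Don's single-connection approach can be sketched with Python's standard sqlite3 and threading modules (table and job names are illustrative); the connect() timeout argument maps to SQLite's busy handler (PRAGMA busy_timeout), which makes lock waits retry instead of failing immediately with "database is locked":

```python
import sqlite3
import threading

# One shared connection guarded by a mutex: a poor man's pool of one.
# check_same_thread=False lets other threads use the connection, and
# timeout=30 installs a 30-second busy handler for lock contention.
conn = sqlite3.connect(":memory:", check_same_thread=False, timeout=30.0)
conn.execute("CREATE TABLE queue (item TEXT)")
db_lock = threading.Lock()

def enqueue(item):
    with db_lock:  # critical section around all database work
        conn.execute("INSERT INTO queue VALUES (?)", (item,))
        conn.commit()

threads = [threading.Thread(target=enqueue, args=(f"job-{i}",))
           for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

count = conn.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
print(count)  # 10
```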