Folks
I am implementing a file-based queue (see my earlier question) using SQLite. I have the following threads running in the background:
thread-1 empties a memory structure into the "queue" table (an INSERT into the "queue" table).
thread-2 reads and "processes" the "queue" table; it runs every 5 to 10 seconds.
thread-3 runs very infrequently; it purges old data that is no longer needed from the "queue" table and also runs VACUUM so the database file stays small.
Now the behavior I would like is for each thread to get whatever lock it needs (waiting with a timeout if possible) and then complete its transaction. It is OK if the threads do not run concurrently; what is important is that a transaction, once begun, does not fail with locking errors such as "database is locked".
I looked at the transaction documentation, but there does not seem to be a "timeout" facility (I am using JDBC). Can a timeout be set to a large value on the connection?
One solution (untried) I can think of is to have a connection pool with a maximum of one connection. Then only one thread could connect at a time, so we should not see any locking errors. Are there better ways?
Thanx!
If it were me, I'd use a single database connection handle. If a thread needs it, it can acquire it within a critical section (or mutex, or similar); this is basically a poor man's connection pool with only one connection in the pool :) It can do its business with the database. When done, it exits the critical section (or releases the mutex, or whatever). You won't get locking errors if you carefully use the single db connection.
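To make that concrete, here is a minimal sketch of the pool-of-one idea in Java. The Xerial sqlite-jdbc driver, and the QueueDb/SqlWork names, are assumptions for illustration, not anything from your code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public final class QueueDb {
    private final Connection conn;  // the one and only handle

    public QueueDb(String path) throws SQLException {
        conn = DriverManager.getConnection("jdbc:sqlite:" + path);
        try (Statement s = conn.createStatement()) {
            // Belt and braces: if a second handle ever shows up, wait up to
            // 30 s for its locks instead of failing with "database is locked".
            s.execute("PRAGMA busy_timeout = 30000");
        }
    }

    // All three threads (writer, processor, purger) funnel through here;
    // synchronized plays the role of the critical section described above.
    public synchronized <T> T run(SqlWork<T> work) throws SQLException {
        return work.apply(conn);
    }

    public interface SqlWork<T> {
        T apply(Connection c) throws SQLException;
    }
}

The purge thread can then run its DELETE and VACUUM inside a single run(...) call, and the other two threads simply wait their turn.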
-Don
I am learning the architecture of MySQL and InnoDB, and the thread model combined with the pluggable engine system confuses me. It is claimed that a MySQL instance is one process with many threads, and that InnoDB has many background threads, such as the master thread and the I/O threads that handle callbacks from kernel AIO. What's more, I find that the connection pool is managed by the MySQL layer, not the pluggable InnoDB layer.
So how do MySQL threads manage connections and their requests? And how does a MySQL request get to kernel AIO; does it cooperate with the InnoDB I/O threads?
A connection is given its own process (or thread, depending on the OS). That process runs along as if it were a single process, except for access to a large number of shared caches, tables, etc. To mediate access to those, there are many "mutexes". These are used to give one process exclusive access while it, say, inserts/deletes something in the buffer_pool or table cache.
In the old days, there were fewer mutexes. For example, to do anything with the buffer pool, a single mutex (or maybe a few) granted access. This was fine when there was one CPU, or maybe two. But when servers had 8 cores, it became clear this was not fine-grained enough, so there was an overhaul of the mutexes. (IIRC, it was spearheaded by Percona.)
Because of those coarse mutexes, only about 4 processes could work "simultaneously" before they stumbled over each other. With more than 8 cores, throughput actually went down. (And latency went through the roof.)
As new versions came out, that number increased. With MySQL 8, it is possibly over 64.
Another thing that happened was "instances" of the buffer_pool and table_open_cache. This was a straightforward way to split one mutex into several.
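Those instances are exposed as plain server variables (set in my.cnf; neither can be changed at runtime). A quick JDBC sketch to see what a server is using, with placeholder connection details:

import java.sql.*;

public class ShowInstances {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection(
                 "jdbc:mysql://localhost/", "root", "secret");
             Statement s = c.createStatement();
             ResultSet rs = s.executeQuery(
                 "SHOW GLOBAL VARIABLES WHERE Variable_name IN"
                 + " ('innodb_buffer_pool_instances', 'table_open_cache_instances')")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " = " + rs.getString(2));
            }
        }
    }
}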
I am less familiar with the I/O thread model. I suspect the thread (for one connection) somehow signals another thread to do the work for it.
Also, several things are left to background tasks: flushing to the logs, updating index blocks from the "change buffer", read-ahead, flushing "dirty" pages from the buffer_pool, cleaning up BTrees, etc. Note that all of these use the appropriate mutex(es) so as not to step on each other or on the various connection processes.
The "pluggable engine" system needs a different kind of chatter -- between the generic handler and the engine-specific code. I suspect this is orthogonal to the mutexes, background threads, etc.
The "connection pool" would be yet-another array (or list) of something, and multiple processes would want to grab/release an item in that array. Hence its own mutex(es).
I am reading about what affects the durability of InnoDB and found two config fields. I have the following questions:
What is the difference between the two config fields? My first guess is that innodb_flush_log_at_trx_commit affects the InnoDB redo log, and sync_binlog affects the standard MySQL binlog. Am I right?
My second question: if the logging process is divided into three phases (write to the log buffer, write to the OS cache, flush to disk), then in which phase does replication to the secondary happen?
About sync_binlog, I have another question. If sync_binlog is set to 0 then, according to the InnoDB docs, the flush does not happen on commit but is delegated to the OS. Could it be that the binlog is synced before the commit, so that replication sees data that is not yet committed?
Add innodb_flush_method to the list.
innodb_flush_log_at_trx_commit should be 1 for safety (you won't lose any data in a crash) or 2 for speed. (0, I think, is there for historic reasons and has no advantage.)
sync_binlog keeps a Slave from trying to read off the end of the Master's binlog (after a crash). Since the data was already sent to the Slaves, it is not a "data loss" issue, but an annoying error that is easily rectified by hand (move it to the next binlog).
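To tie the knobs together, here is a small JDBC sketch that sets the "safe" values and reads all three back. The credentials are placeholders; SET GLOBAL needs the appropriate privilege, and innodb_flush_method itself can only be set in my.cnf:

import java.sql.*;

public class DurabilityKnobs {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection(
                 "jdbc:mysql://localhost/", "root", "secret");
             Statement s = c.createStatement()) {
            // Flush the InnoDB redo log at every COMMIT ...
            s.execute("SET GLOBAL innodb_flush_log_at_trx_commit = 1");
            // ... and fsync the binlog at every commit (group commit batches this).
            s.execute("SET GLOBAL sync_binlog = 1");
            try (ResultSet rs = s.executeQuery(
                     "SHOW GLOBAL VARIABLES WHERE Variable_name IN"
                     + " ('innodb_flush_log_at_trx_commit', 'sync_binlog',"
                     + " 'innodb_flush_method')")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " = " + rs.getString(2));
                }
            }
        }
    }
}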
I think this is the order of the replication steps (see the sketch after this list). Note: in the case of a transaction, nothing happens until the COMMIT. (See also binlog_cache_size.)
1. Send the data to the Slaves, and flush the copy to the binlog if sync_binlog is on (otherwise let it be flushed eventually). (I don't know which happens first, or even whether they are done by separate threads.)
2. The I/O thread on the Slave copies the data to its "relay log".
3. Eventually (usually right away), the execution thread on the Slave performs the query.
(Multi-source replication and parallel execution on the Slave add further complications.)
Semi-sync gets involved somewhere.
Galera is different altogether; see "gcache", etc.
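And the promised sketch: a rough way to watch steps 2 and 3 from the Slave's side (pre-8.0 command and column names; connection details are placeholders):

import java.sql.*;

public class ReplicaStatus {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection(
                 "jdbc:mysql://slavehost/", "monitor", "secret");
             Statement s = c.createStatement();
             ResultSet rs = s.executeQuery("SHOW SLAVE STATUS")) {
            if (rs.next()) {
                // Step 2: how far the I/O thread has copied into the relay log.
                System.out.println("I/O thread running: " + rs.getString("Slave_IO_Running"));
                System.out.println("Read from master:   " + rs.getString("Master_Log_File")
                        + ":" + rs.getString("Read_Master_Log_Pos"));
                // Step 3: how far the execution (SQL) thread has gotten.
                System.out.println("SQL thread running: " + rs.getString("Slave_SQL_Running"));
                System.out.println("Executed up to:     " + rs.getString("Relay_Master_Log_File")
                        + ":" + rs.getString("Exec_Master_Log_Pos"));
            }
        }
    }
}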
What do you mean by "secondary replication"?
Your item 3 may be referring to an obscure case where something could drop through the cracks because too many pieces of hardware are involved.
If you want reliability, see Galera Cluster and/or InnoDB Cluster. This goes beyond what is available with simple Master-Slave replication even with semi-sync.
I've been programming for just a few years, and we have a default DLL used for data access. It seems like there has been some data mining or site scraping going on lately, and although there are no issues with our SQL database connections, many of the programs that access the AS/400 are keeping connections open and idle for long periods of time. I looked through our default data-access DLL and added code to close the connection after each function, but that didn't help. I have little experience with DB2 on the AS/400. How do I close all of these open/idle connections from the code?
If you're using connection pools, that's working as designed.
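To illustrate why "closing after each function" doesn't reduce what the AS/400 sees: with a pool, close() only returns the connection to the pool. A sketch in Java (HikariCP and the IBM Toolbox jt400 driver are stand-ins here, not what your DLL uses; the idle-timeout settings are where you would trim idle connections):

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.sql.Connection;
import java.sql.SQLException;

public class PooledAccess {
    public static void main(String[] args) throws SQLException {
        HikariConfig cfg = new HikariConfig();
        cfg.setJdbcUrl("jdbc:as400://myas400");  // placeholder host
        cfg.setUsername("user");
        cfg.setPassword("secret");
        cfg.setMaximumPoolSize(5);
        cfg.setMinimumIdle(0);       // let the pool drain when it is quiet
        cfg.setIdleTimeout(60_000);  // retire connections idle for 60 s
        try (HikariDataSource ds = new HikariDataSource(cfg)) {
            try (Connection c = ds.getConnection()) {
                // ... do the work ...
            }  // "close" hands the connection back to the pool; the physical
               // connection (and its QZDASOINIT job) stays up
        }      // closing the pool itself is what really hangs up
    }
}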
Are you sure the connection is actually open? How are you determining that?
If you're just seeing locks held by the QZDASOINIT job on the IBM i, that's also by design. The system will hard close tables (cursors) after the first use. When they are used again by the same job, the system will only pseudo-close them, in order to provide a faster response when they are reused.
If an operation needing exclusive access is attempted, the system will hard close the pseudo-closed cursor.
I’ve two questions about SQLite VACUUM (and possibly WAL):
• If multiple processes have the DB open, do all SQL statements in all processes need to be finalized for VACUUM to succeed?
• Why would VACUUM sometimes have no effect (no space reclaimed) even though SQLite returns SQLITE_OK?
A bit more detail about my problem:
I have a database in WAL mode accessed by two processes. At some point, the user has a choice of dropping the data from the database. Because the database can be open in multiple processes, I delete the records and then run VACUUM to reclaim the disk space (instead of closing the connection and deleting the file).
The problem is that if the two processes both have a DB connection open, VACUUM from one of them returns OK but does not actually reclaim the space.
I think what happens is that VACUUM won't succeed as long as there is an outstanding SQL statement in any process. The problem is that I do not want to make the two processes aware of each other.
What I am considering is doing VACUUM from both processes, so that whichever closes its connection last (upon the user request to drop the data) takes care of the space reclamation. I am also considering auto_vacuum (I am aware of its limitations, but DELETEs are not very frequent on this database).
The documentation says:
A VACUUM will fail if there is an open transaction, or if there are one or more active SQL statements when it is run.
Just as with any other kind of database modification, a writable transaction requires that no other read or write transaction is active; this usually requires that all statements are reset or finalized.
When in WAL mode, a write transaction is not blocked by other read transactions, but this requires that old data be kept. You should initiate a truncate checkpoint in addition to the vacuum.
The easiest way to handle this would be to not run VACUUM at all; the free space will be reused later.
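If you do want to force the space back while other connections stay open, the checkpoint-plus-vacuum sequence looks like this over JDBC (a sketch; the Xerial sqlite-jdbc driver and the file name are assumptions, and both statements only take full effect once no other connection holds an open read transaction):

import java.sql.*;

public class ReclaimSpace {
    public static void main(String[] args) throws SQLException {
        try (Connection c = DriverManager.getConnection("jdbc:sqlite:app.db");
             Statement s = c.createStatement()) {
            // Move WAL content into the main database file and truncate the WAL.
            s.execute("PRAGMA wal_checkpoint(TRUNCATE)");
            // Rebuild the main file; this is the step that shrinks it on disk.
            s.execute("VACUUM");
        }
    }
}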
I have an IIS web server that hosts 400 web applications (distributed across 30 application pools). They are a mix of ASP.NET applications and WCF service endpoints. The server has 32GB of RAM and is usually fast, although it runs at 95% memory usage. Worker processes each take between 500MB and 1.5GB of RAM.
I also have another box running SQL Server. That one has plenty of free memory.
Sometimes, the web server starts throwing SQL timeout exceptions: a few per minute at first, rapidly increasing to hundreds per minute, effectively taking the server down. The problem affects applications in all pools. Some requests still complete, but most of them don't. While this happens, the CPU usage on the server is around 30% (which is the normal load on that box).
While this is happening, we can still use SQL Server Management Studio (from the IIS Server) to execute requests successfully (and fast).
The fix is to restart IIS. And then everything goes back to normal until the next time.
Because the server is running very low on memory, I feel like this is the cause. But I cannot explain the relationship between low memory and sudden bursts of SQL timeout exceptions.
Any idea?
Memory pressure can trigger paging and garbage collection. Both introduce latency which would not be present otherwise.
GC'ing 32GB of data can take seconds. Why would all app processes GC at the same time? Because at about 95% memory utilization, Windows raises a "low memory" notification that the CLR listens for. The CLR will then try to release memory to help other processes.
If the applications get into a paging frenzy that would also explain huge delays in normal execution.
This is just guessing, though. You can try to prove it by looking at the hard-page-fault counter (Memory\Pages/sec) and at the gen 2 collection counter (# Gen 2 Collections, under .NET CLR Memory).
The fix would be to run with a larger margin below the physical memory limit.
The first problem is to discover where the timeout is happening. Can you tell from the stack trace if the timeout is happening when executing a request against the database, or when connecting to the database? (Or even connecting to the web server?)
Timeouts executing database requests can have a variety of causes. The problem might be in the database, with blocking processes, database maintenance (also locking), deadlocks, etc. When the apps are running slowly, do you see a lot of entries in sys.dm_exec_requests, and if so, what are their wait_types?
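To check, something along these lines against sys.dm_exec_requests lists who is waiting on what (a sketch with the Microsoft JDBC driver and placeholder credentials; the embedded SELECT can be pasted straight into an SSMS window):

import java.sql.*;

public class WhoIsWaiting {
    public static void main(String[] args) throws SQLException {
        String sql = "SELECT session_id, status, command, wait_type, wait_time,"
                + " blocking_session_id FROM sys.dm_exec_requests"
                + " WHERE wait_type IS NOT NULL OR blocking_session_id <> 0";
        try (Connection c = DriverManager.getConnection(
                 "jdbc:sqlserver://dbhost;databaseName=master;user=sa;password=secret");
             Statement s = c.createStatement();
             ResultSet rs = s.executeQuery(sql)) {
            while (rs.next()) {
                System.out.printf("session %d: %s, wait=%s (%d ms), blocked by %d%n",
                        rs.getInt("session_id"), rs.getString("command"),
                        rs.getString("wait_type"), rs.getInt("wait_time"),
                        rs.getInt("blocking_session_id"));
            }
        }
    }
}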
Even if you can run SQL in the query window while the web server is timing out, that doesn't mean there isn't massive blocking or deadlocking going on.
If it is a timeout connecting to the database, then it is possible the ADO connection pools are overwhelmed and not getting cleaned up, or the database has a connection limit, and the web services are timing out waiting for a connection.
One of the best ways to find out what is going on is to capture a memory dump of the w3wp.exe process and analyze it. Even if you aren't adept at a debugger like WinDbg, Microsoft's DebugDiag tool can produce some nice reports with helpful information.
SqlCommand.CommandTimeout
This property is the cumulative time-out for all network reads during command execution or processing of the results. A time-out can still occur after the first row is returned, and does not include user processing time, only network read time.
It is a client-side timeout. If work is getting queued up due to memory constraints, that could cause a timeout.
Are you retrieving a lot of data from these queries?
If some queries return a lot of data, consider breaking them up and giving the user next and previous buttons.
Have you considered going asynchronous, with something like BeginExecuteReader?
The advantage is no timeout.
It does not block the calling thread.
// Flag is cleared again in the callback when the command completes.
isExecutingFTSindexWordOnce = true;
// Kick off the command asynchronously; the SqlCommand is passed as the
// state object so the callback can call EndExecuteNonQuery on it.
sqlCmdFTSindexWordOnce.BeginExecuteNonQuery(callbackFTSindexWordOnce, sqlCmdFTSindexWordOnce);
// Control returns here immediately; isExecutingFTSindexWordOnce is set
// to false in the callback.
Debug.WriteLine("Calling thread active");
But I agree with your comment about how to respond to the request, since the answer does not come back on the calling thread.
Sorry, I am used to WPF, where I just update a public property in the callback.