Why does this transaction produce a deadlock? - mariadb

My application runs on MariaDB using a master-master Galera replication setup.
The application can handle deadlocks, but I've been working to minimize those that occur as they fill up my log files. There remains one transaction that gets regular deadlocks and I don't know how to avoid it.
The process deletes a record from one table, does a couple of operations on other tables and then finally inserts a record into the original table.
The transaction looks broadly like this:
1. DELETE FROM table_a WHERE `id` = 'Foo'
2. REPLACE INTO table_b ( ... )
3. UPDATE table_c SET ....
4. INSERT INTO table_a (id,...) VALUES ('Bar',...)
The final insert regularly gets a deadlock although retrying the transaction fixes it. What is it about this pattern that causes a deadlock? What can I do to reduce the occurrence?

Question: Is the 'deadlock' in the node you are writing to? Or does the deadlock not occur until COMMIT; that is, when trying to reconcile across the cluster?
If on the writing node...
As soon as possible in the transaction, do
SELECT id FROM table_a WHERE ... FOR UPDATE;
to signal what row(s) you will be inserting in step 4.
Also, consider changing REPLACE to an equivalent INSERT .. ON DUPLICATE KEY UPDATE ... I don't know if it will directly help with the deadlock, but at least it is (probably) more efficient.
If on the cluster...
Are you touching lots of different rows? Are you using round-robin to pick which node to write to?
In any case, speeding up the transaction will help. Is there anything that can be pulled out of the transaction? Some thoughts there:
Normalization can generally be done in its own transaction.
If you have an "if" in the transaction, it might be worth it to do a tentative test beforehand. (But you probably need to keep the "if" in the transaction.)
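Since the question notes that retrying the transaction succeeds, here is a minimal sketch of such a retry loop with backoff. `DeadlockError` and `run_with_retry` are hypothetical names, not part of any MariaDB driver; in a real application you would catch whatever exception your driver raises for a deadlock (error 1213) instead.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock exception."""


def run_with_retry(transaction, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Run transaction(); on DeadlockError, back off and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller log it once
            # Exponential backoff with jitter so colliding writers desynchronize.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.5))


# Demo: a fake transaction that deadlocks twice, then succeeds.
calls = {"n": 0}


def flaky_transaction():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockError()
    return "committed"


result = run_with_retry(flaky_transaction, sleep=lambda seconds: None)
```

Logging only when the final attempt fails, rather than on every deadlock, is one way to keep the log files quiet while the shorter transaction is being worked on.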

Related

Can MariaDB return incomplete data?

I am using MySQL Connector to connect to a MariaDB server.
A function in my program periodically retrieves all entries in a table (with a select * from ... without any wheres, limits, etc.).
After it gets the data, it checks whether these rows (identified by an auto-incremented id) are already present in its memory, and if not, it adds them. But if a row is present in the in-memory list and missing from the retrieved list, then that row must be deleted from memory.
Deleting that row from memory is not the only thing that's gonna happen. It also deletes a bunch of other tables/files linked to that row. So, if the connector somehow fails, does not retrieve the full list, and does not report this, then I'll get into trouble.
It might be a bit of a stupid question, but I couldn't work out whether I need any additional safety measures.

Controlling read locks on table for multithreaded plsql execution

I have a driver table with a flag that determines whether that record has been processed or not. I have a stored procedure that reads the table, picks a record up using a cursor, does some stuff (inserts into another table) and then updates the flag on the record to say it's been processed. I'd like to be able to execute the SP multiple times to increase processing.
Obvious answer seemed to be to use 'for update skip locked' in the select for the cursor but it seems this means I cannot commit within the loop (to update the processed flag and commit my inserts) without getting the fetch out of sequence error.
Googling tells me Oracle's AQ is the answer but for the time being this option is not available to me.
Other suggestions? This must be a pretty common request but I've been unable to find anything that useful.
TIA!
A

Deadlock caused by (row-level locking and) delete then insert in a table with no constraints

We have an online application which checks room availability. It uses a procedure that treats a table (with no constraints) as temporary storage: it first deletes the table's contents and then inserts the searched rows (found using select queries in a cursor). Another user in another session does the same thing: deletes the table data and inserts it again.
Some observations from deadlock graph:
1. The enqueue type is TX (as seen in the deadlock graph), so this is definitely not locking due to unindexed foreign keys.
2. The mode the lock is being waited on is 'X' (exclusive), so there is row-level locking.

Can a Select Count(*) Affect Writes in Cassandra

I experienced a scenario where a select count(*) on a table every minute (yes, this should definitely be avoided) caused a huge increase in Cassandra writes to around 150K writes per second.
Can anyone explain this weird behavior? Why would a Select query significantly increase write count in Cassandra?
Thanks!
If you check
org.apache.cassandra.metrics:type=ReadRepair,name=RepairedBackground
and
org.apache.cassandra.metrics:type=ReadRepair,name=RepairedBlocking
metrics, you can see whether it's read repairs sending mutations. Perhaps reading all the data to service the count(*) is causing a lot of read repairs if your data is inconsistent. If that's the case, lowering read_repair_chance and dclocal_read_repair_chance on the table (ALTER TABLE) could reduce the load.
Other likely possibilities are:
You have tracing enabled (either globally or on the table) for some percentage of requests.
Or, if you use DSE, you have the slow query log enabled.
A possible explanation could be found in the write path of an update:
During a write, Cassandra adds each new row to the database without checking on whether a duplicate record exists. This policy makes it possible that many versions of the same row may exist in the database.
Then
Most Cassandra installations store replicas of each row on two or more nodes. Each node performs compaction independently. This means that even though out-of-date versions of a row have been dropped from one node, they may still exist on another node.
And finally:
This is why Cassandra performs another round of comparisons during a read process. When a client requests data with a particular primary key, Cassandra retrieves many versions of the row from one or more replicas.
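To make the read-then-write effect concrete, here is a toy model in Python. It is not Cassandra code and uses no Cassandra API; `read_with_repair` and `count_all` are made-up names, and each replica is just a dict mapping a key to a (timestamp, value) pair. It only shows how a full-table read can generate writes when replicas disagree.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and the number of repair writes issued."""
    versions = [replica[key] for replica in replicas if key in replica]
    if not versions:
        return None, 0
    newest = max(versions, key=lambda version: version[0])  # highest timestamp wins
    repairs = 0
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # a write caused purely by the read
            repairs += 1
    return newest[1], repairs


def count_all(replicas):
    """Toy count(*): read every key, repairing stale replicas along the way."""
    keys = set().union(*(replica.keys() for replica in replicas))
    total_repairs = 0
    for key in keys:
        _, repairs = read_with_repair(replicas, key)
        total_repairs += repairs
    return len(keys), total_repairs


replicas = [
    {"a": (2, "new"), "b": (1, "x")},
    {"a": (1, "old"), "b": (1, "x")},  # stale version of "a"
    {"b": (1, "x")},                   # missing "a" entirely
]
first = count_all(replicas)   # two replicas get repaired for key "a"
second = count_all(replicas)  # replicas now agree, so no further writes
```

In this model the first count touches every key and writes to each inconsistent replica, while the second count writes nothing, which matches the idea that the write load depends on how inconsistent the data is.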

Sqlite3: How to interrupt a long running update without roll back?

I have a long running multirow update such as:
UPDATE T set C1 = calculation(C2) where C1 is NULL
If table is large this update may take many seconds or even minutes.
During this time all other queries on this table fail with "database is locked"
after connection timeout expires (currently my timeout is 5 seconds).
I would like to stop this update query after, say, 3 seconds,
then restart it. Hopefully, after several restarts entire table will be updated.
Another option is to stop this update query before making any other request
(this will require inter-process cooperation, but it may be doable).
But I cannot find a way to stop update query without rolling back
all previously updated records.
I tried calling interrupt and returning non-0 from progress_handler.
Both these approaches abort the update command
and roll back all the changes.
So, it appears that sqlite treats this update as a transaction,
which does not make much sense in this case because all rows are independent.
But I cannot start a new transaction for each row, can I?
If interrupt and progress_handler cannot help me, what else can I do?
I also tried UPDATE with LIMIT and also WHERE custom_condition(C1).
These approaches do allow me to terminate update earlier,
but they are significantly slower than regular update
and they cannot terminate the query at specific time
(before another connection timeout expires).
Any other ideas?
This multirow update is such a common operation
that, I hope, other people have a good solution for it.
"So, it appears that sqlite treats this update as a transaction, which does not make much sense in this case because all rows are independent."
No, that actually makes perfect sense, because you're not executing multiple, independent updates. You're executing a single update statement. The fine manual says:
No changes can be made to the database except within a transaction.
Any command that changes the database (basically, any SQL command
other than SELECT) will automatically start a transaction if one is
not already in effect. Automatically started transactions are
committed when the last query finishes.
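The behaviour described in the question can be reproduced with Python's built-in sqlite3 module. This is a small sketch under assumptions: the table layout is invented, and `C2 * 2` stands in for the asker's `calculation(C2)`. A progress handler that returns a non-zero value aborts the single UPDATE statement, and none of its row changes survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (id INTEGER PRIMARY KEY, C1 INTEGER, C2 INTEGER)")
conn.executemany("INSERT INTO T (C2) VALUES (?)", ((i,) for i in range(20000)))
conn.commit()

# Ask SQLite to call the handler every 1000 VM instructions; returning a
# non-zero value tells it to abort the running statement.
conn.set_progress_handler(lambda: 1, 1000)
try:
    conn.execute("UPDATE T SET C1 = C2 * 2 WHERE C1 IS NULL")
    aborted = False
except sqlite3.DatabaseError:
    aborted = True
finally:
    conn.set_progress_handler(None, 0)  # remove the handler again
conn.rollback()  # make sure no transaction is left open

# Count the rows the aborted statement managed to keep updated.
updated = conn.execute("SELECT COUNT(*) FROM T WHERE C1 IS NOT NULL").fetchone()[0]
```

Because the whole UPDATE is one statement inside one transaction, the interrupted run leaves `updated` at zero rather than at some partial count.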
If you can determine the range of keys involved, you can execute multiple update statements. For example, if a key is an integer, and you determine the range to be from 1 to 1,000,000, you can write code to execute this series of updates.
begin transaction;
UPDATE T set C1 = calculation(C2)
where C1 is NULL and your_key between 1 and 100000;
commit;
begin transaction;
UPDATE T set C1 = calculation(C2)
where C1 is NULL and your_key between 100001 and 200000;
commit;
Other possibilities . . .
You can sleep for a bit between transactions to give other queries a chance to execute.
You can also time execution using application code, and calculate a best guess at range values that will avoid timeouts and still give good performance.
You can select the keys for the rows that will be updated, and use their values to optimize the range of keys.
In my experience, it's unusual to treat updates this way, but it sounds like it fits your application.
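The series of range updates above, with a pause between transactions, could be driven from application code roughly like this. It is a sketch using Python's built-in sqlite3 module; `update_in_ranges` is a hypothetical helper, the schema is assumed, and `C2 * 2` stands in for `calculation(C2)`.

```python
import sqlite3
import time


def update_in_ranges(conn, max_key, batch_size=100_000, pause=0.0):
    """Apply the update one key range at a time, committing after each range."""
    batches = 0
    for low in range(1, max_key + 1, batch_size):
        high = low + batch_size - 1
        with conn:  # commits on success, rolls back only this batch on error
            conn.execute(
                "UPDATE T SET C1 = C2 * 2 "
                "WHERE C1 IS NULL AND your_key BETWEEN ? AND ?",
                (low, high),
            )
        batches += 1
        time.sleep(pause)  # give other connections a chance to take the lock
    return batches


# Small in-memory demo with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (your_key INTEGER PRIMARY KEY, C1 INTEGER, C2 INTEGER)")
conn.executemany(
    "INSERT INTO T (your_key, C2) VALUES (?, ?)", ((k, k) for k in range(1, 251))
)
conn.commit()

batches = update_in_ranges(conn, max_key=250, batch_size=100)
remaining = conn.execute("SELECT COUNT(*) FROM T WHERE C1 IS NULL").fetchone()[0]
```

Each batch is its own transaction, so stopping the loop between batches keeps everything committed so far. The batch size is the knob to tune against the other connections' timeout.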
"But I cannot start a new transaction for each row, can I?"
Well, you can, but it's probably not going to help. It's essentially the same as the method above, using a single key instead of a range. I wouldn't fire you for testing that, though.
On my desktop, I can insert 100k rows in 1.455 seconds, and update 100k rows with a simple calculation in 420 ms. If you're running on a phone, that's probably not relevant.
You mentioned poor performance with LIMIT. Do you have a lastupdated column with an index on it? At the top of your procedure you would record the current datetime as UPDATE_COMMENCED and use it for every batch in the run:
update foo
set myvalue = 'x', lastupdated = UPDATE_COMMENCED
where id in
(
select id from foo where lastupdated < UPDATE_COMMENCED
limit SOME_REASONABLE_NUMBER
)
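A loop around that statement might look like the following sketch, again with Python's built-in sqlite3 module. `update_stale_rows` is a hypothetical name, the `foo` schema is assumed, and a Unix timestamp plays the role of UPDATE_COMMENCED. Rows whose lastupdated is NULL would not match the `<` comparison, so the demo initialises the column to 0.

```python
import sqlite3
import time


def update_stale_rows(conn, batch_size=1000, pause=0.0):
    """Repeat the LIMIT-ed update until no stale rows are left."""
    commenced = time.time()  # fixed once, reused for every batch in the run
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "UPDATE foo SET myvalue = 'x', lastupdated = ? "
                "WHERE id IN (SELECT id FROM foo WHERE lastupdated < ? LIMIT ?)",
                (commenced, commenced, batch_size),
            )
        if cur.rowcount == 0:
            return total  # nothing older than the start time remains
        total += cur.rowcount
        time.sleep(pause)  # let other connections in between batches


# Small in-memory demo with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, myvalue TEXT, lastupdated REAL)")
conn.execute("CREATE INDEX foo_lastupdated ON foo (lastupdated)")
conn.executemany("INSERT INTO foo (myvalue, lastupdated) VALUES ('old', 0.0)", [()] * 2500)
conn.commit()

total = update_stale_rows(conn, batch_size=1000)
stale = conn.execute("SELECT COUNT(*) FROM foo WHERE myvalue <> 'x'").fetchone()[0]
```

Updated rows get the run's start time, so they drop out of the subquery on the next pass and the loop ends once a batch changes no rows.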
P.S. With respect to slowness:
"I also tried UPDATE with LIMIT and also WHERE custom_condition(C1). These approaches do allow me to terminate update earlier, but they are significantly slower than regular update..."
If you're willing to give other processes access to stale data, and your update is designed so as not to hog system resources, why is there a need to have the update complete within a certain amount of time? There seems to be no need to worry about performance in absolute terms. The concern should be relative to other processes -- make sure they're not blocked.
I also posted this question at
http://thread.gmane.org/gmane.comp.db.sqlite.general/81946
and got several interesting answers, such as:
divide range of rowid into slices and update one slice at a time
use AUTOINCREMENT feature to start new update at the place where the previous update ended (by LIMIT 10000)
create a trigger that calls select raise(fail, ...) to abort update without rollback
