We have a long request that does a catalog search, calculates some information, and then stores the result in another document. A call is made to index the document after the store.
In the logs we're getting errors such as:
INFO ZPublisher.Conflict ConflictError at /pooldb/csvExport/runAgent: database conflict error (oid 0x017f1eae, class BTrees.IIBTree.IISet, serial this txn started with 0x03a30beb4659b266 2013-11-27 23:07:16.488370, serial currently committed 0x03a30c0d2ca5b9aa 2013-11-27 23:41:10.464226) (19 conflicts (0 unresolved) since startup at Mon Nov 25 15:59:08 2013)
and the transaction is getting aborted and restarted.
All the documentation I've read says that, thanks to MVCC in ZODB, read conflicts no longer occur. Since this is written in RestrictedPython, adding a savepoint is not a simple option (though it's likely my only choice, I'm guessing).
Is there another way to avoid this conflict? If I do need to use savepoints, can anyone think of a security reason why I shouldn't whitelist the savepoint transaction method for use in PythonScripts?
The answer is that there really is no way to run a large transaction involving writes while other small transactions are touching the same objects at the same time. BTrees are supposed to have special conflict-resolution code, but it doesn't seem to work in the cases I'm dealing with.
The only way I can see to manage this is to break the reindex into smaller transactions, which reduces the chance of a conflict. Something like this really should be built into Zope/Plone.
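A minimal sketch of that batching idea, in plain Python so it runs anywhere: the `commit` callable stands in for `transaction.commit()` in Zope, and `reindex_one` stands in for whatever per-object reindex you do. The names are illustrative, not Plone API.

```python
def reindex_in_batches(items, reindex_one, commit, batch_size=100):
    """Reindex `items`, committing every `batch_size` items so the
    write transaction stays small and the conflict window shrinks."""
    count = 0
    for item in items:
        reindex_one(item)
        count += 1
        if count % batch_size == 0:
            commit()  # let competing small transactions interleave
    commit()  # commit the final partial batch

# Example with dummy callables standing in for the ZODB machinery:
done, commits = [], []
reindex_in_batches(range(250), done.append,
                   lambda: commits.append(len(done)), batch_size=100)
```

In real Zope code each `commit()` would also be a natural place to re-query the catalog, since other transactions may have changed it in the meantime.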
Related
New to AWE (Approval Workflow Engine). I have a situation where data, configuration, or code has become "broken", scrambled, or corrupted while building this approval process.
I'd like to clear out all stuck transactions and start again.
Monitor Approvals shows the status of every approval process as 'Approved'. They are not. So I'd like to just clear out all the transaction data and start again.
Which tables do I need to clear out (update and/or delete)?
I've found that, since AWE is (for now) built on top of plain old Workflow, I was able to get myself out of trouble by simply updating the Worklist items' status to approved for the user and clearing the header and XREF tables:
update psworklist set inststatus = 3 where oprid='TheUser';
delete from ps_your_xref;
delete from ps_your_header_rec;
Warning: this may well be a very cowboy approach; it has worked fine for me in Development.
On the Linux server where our web app runs, we also have a small app that uses
SQLite (it is written in C).
For performing database actions we use the following commands:
sqlite3_prepare_v2
sqlite3_bind_text or sqlite3_bind_int
sqlite3_step
sqlite3_finalize
Every now and then there was a concurrency situation and I got the following error:
database is locked
So I thought: "This happens when one process writes a certain record and the
other one is trying to read exactly the same record."
So after every step command where this collision could occur, I checked for this error. When it happened, I waited a few milliseconds and then tried again.
But the sqlite error "database is locked" still occurred.
So I changed every step command and the code lines after it. Somehow I thought that this "database is locked" error could only occur with the step command.
But the error kept coming.
My question is now:
Do I have to check for error code 5 ("database is locked") after every sqlite3 command?
Thanks a lot in advance.
If you're receiving error code 5 (SQLITE_BUSY), you can limit this by using an immediate transaction. If you're able to begin an immediate transaction, SQLite guarantees that you won't receive a busy error until you commit.
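Here is the pattern as a sketch in Python's built-in sqlite3 module for brevity; in the C API the equivalent is executing "BEGIN IMMEDIATE" via sqlite3_exec and retrying only that statement on SQLITE_BUSY. The function and path names are illustrative.

```python
import sqlite3
import time

def write_with_retry(db_path, sql, params, retries=5, delay=0.05):
    """Run one write inside BEGIN IMMEDIATE. Only the BEGIN can fail
    with "database is locked"; once it succeeds, the statements that
    follow won't see a busy error until COMMIT."""
    conn = sqlite3.connect(db_path, isolation_level=None)  # autocommit; we manage txns
    try:
        for _ in range(retries):
            try:
                conn.execute("BEGIN IMMEDIATE")  # acquire the write lock up front
                break
            except sqlite3.OperationalError:     # SQLITE_BUSY surfaces here
                time.sleep(delay)
        else:
            raise RuntimeError("could not acquire write lock")
        conn.execute(sql, params)
        conn.execute("COMMIT")
    finally:
        conn.close()
```

This answers the question above: you don't need to check every call for error code 5, only the point where the transaction acquires the lock.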
Also note that SQLite doesn't have row-level locking; the entire database is locked. Using a WAL journal, you can have one writer and multiple readers. With other journaling methods, you can have either one writer or multiple readers, but not both simultaneously.
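Switching to WAL is a one-time PRAGMA against the database file; the mode persists in the file itself. A sketch using Python's sqlite3 module (in the C API you'd run the same PRAGMA through sqlite3_exec; the path here is just for illustration):

```python
import sqlite3
import tempfile
import os

# One-time switch to WAL; the journal mode is stored in the database
# file, so every later connection picks it up automatically.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # prints "wal" on success
conn.close()
```

After this, readers no longer block the (single) writer, which removes a large class of "database is locked" errors.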
SQLite Documentation on 'SQLITE_BUSY'
We have a new beta version of our software with some changes, but not around our database layer.
We've just started getting Error 3128 reported in our server logs. It seems that once it happens, it happens for as long as the app is open. The part of the code where it is most apparent is where we log data every second via SQLite. We've generated 47k errors on our server this month alone.
3128 Disk I/O error occurred. Indicates that an operation could not be completed because of a disk I/O error. This can happen if the runtime is attempting to delete a temporary file and another program (such as a virus protection application) is holding a lock on the file. This can also happen if the runtime is attempting to write data to a file and the data can't be written.
I don't know what could be causing this error. Maybe an anti-virus program? Maybe our app is getting confused and writing data on top of each other? We're using async connections.
It's causing lots of issues and we're at a loss. It happened in our older version too, but maybe 100 times a month rather than 47,000. Either way, I'd like to make it happen zero times.
Possible solution: Exception Message: Some kind of disk I/O error occurred
Summary: There is probably not a problem with the database but a problem creating (or deleting) the temporary file once the database is opened. AIR may have permissions to the database, but not to create or delete files in the directory.
One approach that has worked for me is to use a PRAGMA statement to set journal_mode to something other than DELETE. You issue the PRAGMA the same way you would issue a query statement.
PRAGMA journal_mode = OFF
Unfortunately, if the application crashes in the middle of a transaction while the OFF journaling mode is set, the database file will very likely become corrupt. [1]
[1] http://www.sqlite.org/pragma.html#pragma_journal_mode
The solution was to make sure database deletes, updates, and inserts only happened one at a time, by adding a small wrapper. On top of that, we had to watch for error 3128 and retry. I think this is because we have a trigger that could lock the database after we inserted data.
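The wrapper pattern looks roughly like this. The original app is AIR/ActionScript, so this is a sketch of the same idea in Python: all writes are funneled through a single worker so only one statement runs at a time, with a few retries for transient failures like error 3128. The class and parameter names are invented for illustration.

```python
import queue
import threading

class SerialWriter:
    """Funnel every write through one worker thread so statements never
    overlap, and retry each write a few times on transient errors
    (e.g. the file briefly locked by a trigger or a virus scanner)."""

    def __init__(self, execute, retries=3):
        self._execute = execute            # callable that performs one write
        self._retries = retries
        self._q = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, stmt):
        """Queue a write; returns an Event that is set when it has run."""
        done = threading.Event()
        self._q.put((stmt, done))
        return done

    def _run(self):
        while True:
            stmt, done = self._q.get()
            for _ in range(self._retries):
                try:
                    self._execute(stmt)
                    break                  # success, stop retrying
                except Exception:
                    continue               # transient error: try again
            done.set()
```

Because everything goes through one queue, write order is preserved and two statements can never hit the file at the same time.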
I can't delete a collection; it returns false every single time.
When I run getCollections() I see a lot of tmp.mr.mapreduce_1299189206_1618_inc collections (the ones I want to drop). I thought they were deleted on disconnection, but in my case they're not.
Then when I run db["tmp.mr.mapreduce_1299188705_5595"].drop(), I always get false and the collection is not deleted.
The logs are not really helpful:
Wed Mar 9 11:05:51 [conn4] CMD: drop customers.tmp.mr.mapreduce_1299188705_5595
Now I've maxed out my namespaces and I cannot create more collections. Help?
BTW, I can take the server down; this is not production (and even in production I could take it down).
Now I've maxed out my namespaces and I cannot create more collections. Help?
By default MongoDB has roughly 20k namespaces, so that's a lot of dead M/R collections. According to the logs, the DB is receiving the request to drop the collection, so the question now is whether MongoDB has gotten into a bad state.
Can you take it down and re-start to ensure that all connections are closed?
Is it a lot of data? Can you take it down and run --repair?