Doctrine: atomic updates and exceptions in a loop - symfony

We are migrating a project from a more basic ORM to using Symfony+Doctrine. In the project we have a lot of cron jobs looking like this:
$rows = $someRepository->getRows();
foreach ($rows as $row) {
    try {
        $db->beginTransaction(); // simple begin transaction in db
        // do some handling of data
        // maybe load some other entities and update those
        // ...
        $db->commit();
    } catch (Throwable $t) {
        // log error
        // clear entity cache
        $db->rollback(); // simple rollback in db
    }
}
When we did it this way, all changes within the try/catch were atomic, while at the same time it was possible to recover from an error and continue with the next $row.
In Symfony+Doctrine, I simply cannot figure out how to mimic this behaviour. Doctrine's recommendation for handling an exception is to close the EntityManager, but how do you recover after that?

The ORM does this implicitly on flush, so most of the time you can avoid the hassle of doing so on your own.
However, if you want clear demarcation you can still do it explicitly, in a similar manner to what you have done so far.
More reading and examples here: https://www.doctrine-project.org/projects/doctrine-orm/en/2.7/reference/transactions-and-concurrency.html
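For reference, the explicit demarcation pattern from that documentation looks roughly like this (a minimal sketch; $em stands for your EntityManagerInterface):
$em->getConnection()->beginTransaction(); // suspend auto-commit
try {
    // ... do some work on entities ...
    $em->persist($object);
    $em->flush();
    $em->getConnection()->commit();
} catch (\Throwable $e) {
    $em->getConnection()->rollBack();
    throw $e;
}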
EDIT related to the comment below:
Instead of injecting the manager, you should inject the registry.
After that, in the catch block, you can check $em->isOpen() and call $registry->resetManager() if it is closed.
I suspect this will also reset the unit of work, so you might encounter detached entities. In that case you should call $em->merge($entity);
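Putting it together, the loop from the question could look roughly like this (a sketch, assuming $registry is an injected Doctrine\Persistence\ManagerRegistry):
foreach ($rows as $row) {
    $em = $registry->getManager();
    $em->getConnection()->beginTransaction();
    try {
        // ... handle $row, load and update other entities ...
        $em->flush();
        $em->getConnection()->commit();
    } catch (\Throwable $t) {
        // log error
        $em->getConnection()->rollBack();
        if (!$em->isOpen()) {
            $registry->resetManager(); // the next getManager() call returns a fresh, open manager
        }
    }
}
Keep in mind that after a reset, entities loaded with the old manager (including the remaining $rows) are detached, so they may need to be re-fetched or merged.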
One thing to note here is that an exception is not considered normal in Doctrine, so they close the manager because of that. You might think that this is overcomplicated - yes it is, because you are working against the philosophy here. Validate your data if you can. Read this section: https://www.doctrine-project.org/projects/doctrine-orm/en/2.7/reference/transactions-and-concurrency.html#exception-handling
As for the why (this is not official, just based on my knowledge): the manager's internal unit of work is a stateful object. When an exception occurs during a transaction, that state remains the same but couldn't be persisted to the database. If they let this go, the EM would try to apply all the state changes again and would encounter the same exception again. So there is no point in leaving it open in the same state; a reset is needed.

Related

Doctrine find() and querybuilder() return different result in PHPUnit test

With Doctrine and Symfony, in my PHPUnit test method:
// Change username for user #1 (Sheriff Woody to Chuck Norris)
$form = $crawler->selectButton('Update')->form([
    'user[username]' => 'Chuck Norris',
]);
$client->submit($form);
// Find user #1
$user = $em->getRepository(User::class)->find(1);
dump($user); // Username = "Sheriff Woody"
$user = $em->createQueryBuilder()
    ->from(User::class, 'user')
    ->andWhere('user.id = :userId')
    ->setParameter('userId', 1)
    ->select('user')
    ->getQuery()
    ->getOneOrNullResult();
dump($user); // Username = "Chuck Norris"
Why do my two methods of fetching user #1 return different results?
diagnosis / explanation
I assume* you already created the User object you're editing via the crawler earlier in that test and checked that it is there. This makes it a managed entity.
It is in the nature of data not to sync itself magically with the database; some automatism must be in place, or some method executed, to sync it.
The find() method will always try to use the cache (unless it is explicitly turned off; also see the side note). The query builder won't if you explicitly call getResult() (or one of its varieties), since you explicitly want a query to be executed. Executing a different query might lead to the cache not being hit, producing the current result. (It should update the first user object, though...) [updated, due to comment from Arno Hilke]
((( side note: Keeping objects in sync is hard. It's mainly about having consistency in the database, but all of ACID is wanted. Any process talking to the database should assume that it is only working with the state at the moment of its first query, and that it is the only user of the database. Unless additional constraints must be met and inconsistent reads can occur, in which case isolation levels should be raised (see also: transactions, or more precisely: isolation). So automatic syncing is usually not wanted. Doctrine uses certain assumptions for performance gains (mainly: isolation/locking is optimistic). However, in your particular case, none of those things are an actual concern... since you actually want a non-repeatable read. )))
(* otherwise, the behavior you're seeing would be really unexpected)
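(If you ever did need a stricter isolation level, DBAL lets you raise it per connection; a minimal sketch, assuming a DBAL version that ships the TransactionIsolationLevel class:)
use Doctrine\DBAL\TransactionIsolationLevel;

// applies to transactions subsequently started on this connection
$em->getConnection()->setTransactionIsolation(TransactionIsolationLevel::SERIALIZABLE);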
solution
One easy solution would be to actively and explicitly sync the data from the database, either by calling $em->refresh($user), or - before fetching the user again - by calling $em->clear(), which will detach all entities (clearing the cache, which might have a noticeable performance impact) and allow you to call find() again with the proper results being returned.
Please note that detaching entities means that any object previously returned from the entity manager should be discarded and fetched again (not via refresh).
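Applied to the test above, either option would look roughly like this (a sketch):
// option 1: re-sync the single managed entity from the database
$em->refresh($user);

// option 2: detach all entities, then fetch a fresh instance
$em->clear();
$user = $em->getRepository(User::class)->find(1);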
alternate solution 1 - everything is requests
instead of checking the database, you could do a different request to a page that displays the user's name, and check that it has changed.
alternate solution 2 - using only one entity manager
using only one entity manager (that is: sharing the entity manager / database in the unit test with the server handling the request) may be a reasonable solution, but it comes with its own set of problems; mainly, omitted commits and flushes may escape detection.
alternate solution 3 - using multiple entity managers
you use one entity manager to set up the test; since the server is using a new entity manager to perform its work, you should theoretically - to do this actually properly - create yet another entity manager to check the server's behavior.
comment: alternate solutions 1, 2 and 3 would work with the highest isolation level; the initial solution probably wouldn't.

intershop ORMException could not update - refresh ORMObject

In a clustered intershop environment, we see a lot of error messages. I'm suspecting the communication between the application servers is not reliable.
Caused by: com.intershop.beehive.orm.capi.common.ORMException:
Could not UPDATE object: com.intershop.beehive.bts.internal.orderprocess.basket.BasketPO
Is there a safe way for the local application server to load the latest instance?
BasketPO basket = null;
try {
    BasketPOFactory factory = (BasketPOFactory) NamingMgr.getInstance().lookupFactory(BasketPOFactory.FACTORY_NAME);
    try (ORMObjectCollection<BasketPO> baskets = factory.getObjectsBySQLWhere("uuid=?", new Object[]{basketID}, CacheMode.NO_CACHING)) {
        if (null != baskets && !baskets.isEmpty()) {
            basket = baskets.stream().findFirst().get();
        }
    }
} catch (Throwable t) {
    Logger.error(this, t.getMessage(), t);
}
Does the ORMObject#refresh method help?
try {
    if (null != basket) {
        basket.refresh();
    }
} catch (Throwable t) {
    Logger.error(this, t.getMessage(), t);
}
You experience that error because an optimistic lock "fails". To understand the problem better, I'll try to explain how optimistic locking works, in particular in the Intershop ORM layer.
There is a column named OCA in the PO tables (OCA == optimistic control attribute?). Imagine that two servers (or two different threads/transactions) try to update the same row in a table. For performance reasons there is no DB locking involved by default (e.g. by issuing select for update). Instead the first thread/server increments the OCA by one when it updates the row successfully within its transaction.
The second thread/server knows the value of the OCA from the time that it created its own state. It then tries to update the row by issuing a similar query:
UPDATE ... OCA = OCA + 1 ... WHERE UUID = <uuid> AND OCA = <old_oca>
Since the OCA is already incremented by the first thread/server this update fails (in reality - updates 0 rows) and the exception that you posted above is thrown when the ORM layer detects that no rows were updated.
Your problem is not the inter-server communication but rather the fact that either:
multiple servers/threads try to update the same object;
there are direct updates in the database that bypass the ORM layer (less likely);
To solve this you may:
Avoid that situation altogether (highly recommended by me :-) );
Use the ISH locking framework (very cumbersome imHo);
Use pessimistic locking supported by the ISH ORM layer and Oracle (beware of potential performance issues, deadlocks, bugs);
Use Java locking - but since the servers run in different JVM-s this is rarely an option;
OFFTOPIC remarks: I'm not sure why you use getObjectsBySQLWhere when you know the primary key (uuid). As far as I remember, ORMObjectCollection-s should be closed if not iterated completely.
UPDATE: If the cluster is not configured correctly and the multicasts can't be received by the nodes, you won't be able to resolve the problems programmatically.
The "ORMObject.refresh()" method marks the cached shared state as invalid. The next access to the object reloads the state from the database. This impacts performance and increases the database server load.
BUT:
The "refresh()" method does not reload the PO instance state if it is already assigned to the current transaction.
It would be best to investigate and fix the server communication issues.
Another possibility is that it isn't a communication problem (multicast between nodes in the cluster, I assume), but that there are simply two requests trying to update the basket at the same time - for example, two AJAX requests updating something on the basket.
I would avoid trying to "fix" the ORM; it would only cause more harm than good. Rather, investigate further and post back more information.

Handling doctrine transactions on multiple entity managers

Both transactions should be rolled back if:
$em1 fails but $em2 succeeds.
$em1 succeeds but $em2 fails.
So, is my example below the correct way of dealing with transactions when more than one EM is involved? I've come up with it after reading the Transactions and Concurrency documentation.
$em1->getConnection()->beginTransaction();
$em2->getConnection()->beginTransaction();
try {
    $em1->persist($object1);
    $em1->flush();
    $em1->getConnection()->commit();

    $em2->persist($object2);
    $em2->flush();
    $em2->getConnection()->commit();
} catch (Exception $e) {
    $em1->getConnection()->rollback();
    $em2->getConnection()->rollback();
}
The reason I'm trying to implement this is that I'm getting a ....resulted in a Doctrine\ORM\ORMException exception (The EntityManager is closed.) error somewhere along the line in the application. I can probably handle it with the method below, but I think using a transaction for the business logic above is better.
private function getNewEntityManager($em)
{
    if (!$em->isOpen()) {
        $em = $em->create($em->getConnection(), $em->getConfiguration());
    }

    return $em;
}
Your example code actually does work, which surprises me, because Francesco Panina is (or should be) correct that $em1->getConnection()->commit() will commit the first transaction, and you will lose the privilege to roll back that transaction should an error arise from the second transaction.
However, something in the way Doctrine handles transaction nesting levels means that you actually can still roll back the first transaction when an error arises from the second transaction.
Nonetheless, best practice would be not to depend on this behavior and instead put both commits at the very end of your try block, like so:
$em1->getConnection()->beginTransaction();
$em2->getConnection()->beginTransaction();
try {
    $em1->persist($object1);
    $em1->flush();
    $em2->persist($object2);
    $em2->flush();

    $em1->getConnection()->commit();
    $em2->getConnection()->commit();
} catch (Exception $e) {
    $em1->getConnection()->rollback();
    $em2->getConnection()->rollback();
    throw $e;
}
With this small change, your example does demonstrate the correct way to deal with transactions that span multiple entity managers.
I'd like to point out a couple of things that may clear your mind on the matter:
I'm not aware of the process you use to create the second entity manager; keep in mind that two completely different entity managers will not share the same connection. Can you point out your use case for two different entity managers?
Consider that the operation:
$em1->getConnection()->commit();
will commit the first transaction and you will lose the privilege to roll back that transaction should an error arise from the second transaction.
Doctrine\ORM\ORMException exception (The EntityManager is closed.)
It's typical when you try to perform any commit/flush operation after a DBAL (database-related) exception has been thrown; in this case Doctrine's default behaviour is to close the entity manager.
And it is common practice to do so after any rollback:
$em1->getConnection()->rollback();
$em1->close();
Hope it helps,
Regards.

Creating many batches (SysOperation Framework) very quickly doing similar processes - "Cannot edit a record in LastValue (SysLastValue)"?

I have a SysOperation Framework process that creates a ReliableAsynchronous batch to post packing slips and several get created at a time.
Depending on how quickly I click to create them, I get:
Cannot edit a record in LastValue (SysLastValue).
An update conflict occurred due to another user process deleting the record or changing one or more fields in the record.
And
Cannot create a record in LastValue (SysLastValue). User ID: t edit a, Class.
The record already exists.
On a couple of them in the BatchHistory. I have this.parmLoadFromSysLastValue(false); set. I'm not sure how to prevent writing to the SysLastValue table.
Any idea what could be going on?
I get this exception a lot too, so I've made it a habit to catch DuplicateKeyException in my service operation. When it is thrown, catch it and retry (for a default of 5 times).
The error occurs when a lot of processes run simultaneously, like you are doing now.
DuplicateKeyException can be caught inside a transaction, so you could improve things by putting a try/catch around the code that does the insert into the SysLastValue table, if you can find that code.
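The catch-and-retry pattern would look roughly like this (a sketch in X++; the retry limit of 5 and the placement around the insert are assumptions):
int retryCount;

try
{
    // ... the code that inserts the record into SysLastValue ...
}
catch (Exception::DuplicateKeyException)
{
    retryCount++;
    if (retryCount <= 5)
    {
        retry; // re-executes the try block from the top
    }
    throw Exception::Error;
}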
As far as I can see, these are the only two occurrences where a record is inserted in this table (except maybe in the kernel):
InventUnusedDimCleanUp.serialize()
SysAutoSemaphore.autoSemaphore()
Put a breakpoint there and see if that code is executed. If so, you can add a try/catch with retry and see if that "fixes" it.
You could also use the tracing cockpit and the trace parser to figure out where that record is inserted if it's not one of those two.
My theory about LoadFromSysLastValue: I believe setting this.parmLoadFromSysLastValue(false) does not work because it is only taken into account when the dialog is started, not when your operation is executed. When in batch, no SysLastValue will be used to initialize your data contract, as you want it to use the exact parameters you have supplied in your data contract.
It's because of code calling SysOperationController.savelast() while in batch. My solution is to set loadFromSysLastValue to false in SysOperationController.loadFromSysLastValue() as part of the in-batch check:
if (!this.isInBatch())
{
    .....
}
//Begin
else
{
    loadFromSysLastValue = false;
}
//End

Symfony2, Doctrine2, Entity insert then update at once

A simple thing, but it doesn't work. At the bottom part of the script we have:
$oMan = $this->getContainer()->get('doctrine')->getManager();
// add entry of calling
$oLastCall = new CronLastCall();
$oLastCall->setType('key');
$oMan->persist($oLastCall);
$oMan->flush();
We insert it into the DB once we create it, then do some stuff that can take a few minutes. Then we call this:
$oLastCall->setDateEnd(new \DateTime('now'));
$oMan->flush();
After this, we exit from the method/action. So according to the logic (and the Doctrine2 manual I read), an entity that has been created becomes 'managed' (we persist it) and we can simply update it. (I call flush at the end to update this entity, but it is not updating.)
Where is the trouble?
As far as I can tell, there's nothing wrong with the code you've shown. But if the $oLastCall object becomes detached between the first and second code blocks, you have to re-attach (merge) it to the manager so that it detects the changes for the second flush.
Merging can be done in this way:
$oLastCallMerged = $oMan->merge($oLastCall);
$oLastCallMerged->setDateEnd(new \DateTime('now'));
$oMan->flush();
You can also check the state (MANAGED/NEW/DETACHED/REMOVED) of an object using this code:
$oMan->getUnitOfWork()->getEntityState($oLastCall);
If that doesn't help (i.e. detachment of the object isn't your problem), you need to give more info about the context this code runs in and any errors you get. Is this code part of a Console Command or a regular web-app Controller? Do you get any output or errors when running it in 'dev' environment? (Check .../app/logs/dev.log.) Does the $oLastCall object stay in memory while waiting for the stuff that takes some minutes, or do you reload it from somewhere?
Btw, objects don't magically get detached by themselves. They'll only be detached if you load them from a different source than the entity manager (for example, storing them in the session between requests) OR if you explicitly detach them by calling $oMan->detach($entity) or $oMan->clear().
Edit
You can also check if Doctrine detects the change by echoing out the changeset using $oMan->getUnitOfWork()->getEntityChangeSet($oLastCall) before and after the change, e.g.:
error_log(json_encode($oMan->getUnitOfWork()->getEntityChangeSet($oLastCall)));
$oLastCall->setDateEnd(new \DateTime('now'));
error_log(json_encode($oMan->getUnitOfWork()->getEntityChangeSet($oLastCall)));
After flushing the manager, all objects will be removed internally from the manager. So if you want to update the object later, you need to call persist again before flushing.
