Drupal: difference between DatabaseQueue and BatchQueue

Looking at the Drupal queue documentation, it is not clear what the difference is between a DatabaseQueue and a BatchQueue. Are both FIFO? Do items expire? Is processing serial, i.e. not concurrent?

From the source of Batch.php:
/**
* Defines a batch queue handler used by the Batch API.
*
* This implementation:
* - Ensures FIFO ordering.
* - Allows an item to be repeatedly claimed until it is actually deleted (no
* notion of lease time or 'expire' date), to allow multipass operations.
*
* Stale items from failed batches are cleaned from the {queue} table on cron
* using the 'created' date.
*
* @ingroup queue
*/
class Batch extends DatabaseQueue {
Batch extends DatabaseQueue. In my experience, batch is most often used with a form, as it will automatically process the batch after it's built.
Yes, FIFO.
No, items don't expire.
And yes, serial processing.

Related

FreeRTOS implementation with X-Cube-BLE2

I'm trying to implement a BLE task in my FreeRTOS application with the X-Nucleo-BNRG2A1 shield. I completed the BLE application (X-Cube-BLE2 package) without FreeRTOS and it has been working fine. To add FreeRTOS, I have been looking at the BLE_HeartRateFreeRTOS example for reference. It looks like the hci_user_evt_proc() function is different between the BLE_HeartRateFreeRTOS example and the X-Cube-BLE2 default example. The BLE_HeartRateFreeRTOS hci_user_evt_proc() has the following comment:
/**
* Up to release version v1.2.0, a while loop was implemented to read out events from the queue as long as
* it is not empty. However, in a bare metal implementation, this leads to calling in a "blocking" mode
* hci_user_evt_proc() as long as events are received without giving the opportunity to run other tasks
* in the background.
* From now, the events are reported one by one. When it is checked there is still an event pending in the queue,
* a request to the user is made to call again hci_user_evt_proc().
* This gives the opportunity to the application to run other background tasks between each event.
*/
Moreover, the hci_user_evt_proc() function is called inside the HciUserEvtProcess() task, where it waits for a Thread Flag. The Thread Flag is set from the hci_notify_asynch_evt() callback function. It looks like the hci_notify_asynch_evt() function is also different from the one in the X-Cube-BLE2 package. FYI:
static void HciUserEvtProcess(void *argument)
{
  UNUSED(argument);
  for (;;)
  {
    osThreadFlagsWait(1, osFlagsWaitAny, osWaitForever);
    hci_user_evt_proc();
  }
}

void hci_notify_asynch_evt(void *pdata)
{
  UNUSED(pdata);
  osThreadFlagsSet(HciUserEvtProcessId, 1);
  return;
}
My question is: how do I implement FreeRTOS with my X-Cube-BLE2 package, given that the library files are different? Should I call hci_user_evt_proc() inside a task with the same priority as the other tasks so that they are not blocked? Or is there an update that I'm missing?
Any expert help is appreciated.

.NET Core: I want to pull 500 Kafka messages at once, how do I configure this? [duplicate]

As per my understanding, a Kafka consumer reads messages from an assigned partition sequentially.
We are planning to have multiple Kafka consumers (Java) with the same group id, so if each one reads sequentially from its assigned partition, how can we achieve high throughput? For example, the producer publishes around 40 messages per second,
while a consumer processes about 1 message per second. We can have multiple consumers, but surely not 40, right? Correct me if I'm wrong.
Also, in our case the consumer has to commit the offset only after a message is processed successfully, otherwise the message will be reprocessed. Is there a better solution?
Based on your question clarification.
A Kafka Consumer can read multiple messages at a time. But a Kafka Consumer doesn't really read messages; it's more correct to say that a Consumer reads a certain number of bytes, and the size of the individual messages then determines how many messages are read. Reading through the Kafka Consumer Configs, you're not allowed to specify how many messages to fetch; you specify a max/min data size that a consumer can fetch. However many messages fit inside that range is how many you will get. You will always get messages sequentially, as you have pointed out.
Related Consumer Configs (for 0.9.0.0 and greater)
fetch.min.bytes
max.partition.fetch.bytes
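For illustration, here is a minimal, hedged sketch of how these two settings could be supplied to the Java consumer (0.9.x-style new consumer). The broker address, group id and topic name are placeholders, and the byte values are arbitrary examples, not recommendations:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class FetchSizeConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "my-group");                // placeholder group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Ask the broker to wait until at least ~50 KB of data is available
        // before answering a fetch (bounded by fetch.max.wait.ms)...
        props.put("fetch.min.bytes", "50000");
        // ...and cap how much data a single partition may return per fetch (1 MB here).
        props.put("max.partition.fetch.bytes", "1048576");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic
        // How many records each poll() returns now depends on the message sizes,
        // not on a fixed message count.
    }
}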
UPDATE
Using your example in the comments, "my understanding is if I specify in config to read 10 bytes and if each message is 2 bytes the consumer reads 5 messages at a time." That is true. Your next statement, "that means the offsets of these 5 messages were random within the partition", is false. Reading sequentially doesn't mean one by one; it just means that the messages remain ordered. You are able to batch items and have them remain sequential/ordered. Take the following examples.
In a Kafka log, if there are 10 messages (each 2 bytes) with the following offsets, [0,1,2,3,4,5,6,7,8,9].
If you read 10 bytes, you'll get a batch containing the messages at offsets [0,1,2,3,4].
If you read 6 bytes, you'll get a batch containing the messages at offsets [0,1,2].
If you read 6 bytes, then another 6 bytes, you'll get two batches containing the messages [0,1,2] and [3,4,5].
If you read 8 bytes, then 4 bytes, you'll get two batches containing the messages [0,1,2,3] and [4,5].
Update: Clarifying Committing
I'm not 100% sure how committing works, I've mainly worked with Kafka from a Storm environment. The provided KafkaSpout automatically commits Kafka messages.
But looking through the 0.9.0.1 Consumer APIs, which I would recommend you do too, there seem to be three methods in particular that are relevant to this discussion.
poll(long timeout)
commitSync()
commitSync(java.util.Map offsets)
The poll method retrieves messages; it could be only 1, it could be 20. For your example, let's say 3 messages were returned: [0,1,2]. You now have those three messages, and it's up to you to determine how to process them. You could process them 0 => 1 => 2, 1 => 0 => 2, 2 => 0 => 1; it just depends. However you process them, after processing you'll want to commit, which tells the Kafka server you're done with those messages.
Using the commitSync() commits everything returned on last poll, in this case it would commit offsets [0,1,2].
On the other hand, if you choose to use commitSync(java.util.Map offsets), you can manually specify which offsets to commit. If you're processing them in order, you can process offset 0 then commit it, process offset 1 then commit it, finally process offset 2 and commit.
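To make that concrete, here is a rough sketch of the poll-then-commit flow against the 0.9.x Java consumer API. The connection settings, topic name and the process() helper are placeholders, not part of the original answer:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class PollAndCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "my-group");                // placeholder
        props.put("enable.auto.commit", "false");         // we commit manually below
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic
            while (true) {
                // poll() may return 0..n records, depending on the fetch sizes discussed above.
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // your business logic goes here

                    // Option B: commit each offset individually after it is processed.
                    // The committed offset is the *next* offset to read, hence +1.
                    Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
                    toCommit.put(new TopicPartition(record.topic(), record.partition()),
                                 new OffsetAndMetadata(record.offset() + 1));
                    consumer.commitSync(toCommit);
                }
                // Option A (instead of the per-record commit above):
                // consumer.commitSync(); // commits everything returned by the last poll()
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        // placeholder for real processing
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}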
All in all, Kafka gives you the freedom to process messages as you desire; you can choose to process them sequentially or in an entirely arbitrary order.
To achieve parallelism, which seems to be what you're asking about, you use topic partitions (you split a topic into N parts, called partitions).
Then, in the consumer, you spawn multiple threads to consume from those partitions.
On the producer side, you publish messages to a random partition (the default), or you provide Kafka with some message attribute to calculate a hash (if ordering is required), which makes sure that all messages with the same hash go to the same partition.
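As a small, hedged illustration of that producer side (broker and topic are placeholders, and the exact keyless partitioning behaviour varies by client version):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KeyedProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // No key: the record is spread across partitions by the default partitioner.
            producer.send(new ProducerRecord<>("my-topic", "some payload"));

            // With a key: hash(key) picks the partition, so all records keyed
            // "order-42" land in the same partition and keep their relative order.
            producer.send(new ProducerRecord<>("my-topic", "order-42", "some payload"));
        }
    }
}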
EDIT (example of offset commit request):
This is how I did it. All methods that are not provided are non-essential.
/**
 * Commits the provided offset for the current client (i.e. unique topic/partition/clientName combination)
 *
 * @param offset
 * @return {@code true} or {@code false}, depending on whether commit succeeded
 * @throws Exception
 */
public static boolean commitOffset(String topic, int partition, String clientName, SimpleConsumer consumer,
        long offset) throws Exception {
    try {
        TopicAndPartition tap = new TopicAndPartition(topic, partition);
        OffsetAndMetadata offsetMetaAndErr = new OffsetAndMetadata(offset, OffsetAndMetadata.NoMetadata(), -1L);
        Map<TopicAndPartition, OffsetAndMetadata> mapForCommitOffset = new HashMap<>(1);
        mapForCommitOffset.put(tap, offsetMetaAndErr);
        kafka.javaapi.OffsetCommitRequest offsetCommitReq = new kafka.javaapi.OffsetCommitRequest(
                ConsumerContext.getMainIndexingConsumerGroupId(), mapForCommitOffset, 1, clientName,
                ConsumerContext.getOffsetStorageType());
        OffsetCommitResponse offsetCommitResp = consumer.commitOffsets(offsetCommitReq);
        Short errCode = (Short) offsetCommitResp.errors().get(tap);
        if (errCode != 0) {
            processKafkaOffsetCommitError(tap, offsetCommitResp, BrokerInfo.of(consumer.host()));
            ErrorMapping.maybeThrowException(errCode);
        }
        LOG.debug("Successfully committed offset [{}].", offset);
    } catch (Exception e) {
        LOG.error("Error while committing offset [" + offset + "].", e);
        throw e;
    }
    return true;
}
You can consume the messages in batches and process them in a batched manner.
fetch.max.wait.ms (property)
the consumer waits at most this amount of time for new messages to accumulate before a fetch returns

Flyway migration status `outOfOrder`?

I have recently enabled outOfOrder in my Flyway config to solve some merge conflicts.
The problem is that when I run migrate, all my scripts get executed, and the status shows OutOfOrder.
I want to know: does OutOfOrder mean a success state?
Yes, that is a successfully applied migration.
See MigrationState#OUT_OF_ORDER for a bit more detail.
/**
* <p>This migration succeeded.</p>
* <p>
* This migration succeeded, but it was applied out of order.
* Rerunning the entire migration history might produce different results!
* </p>
*/
OUT_OF_ORDER("OutOrdr", true, true, false)
/**
* Creates a new MigrationState.
*
* @param displayName The name suitable for display to the end-user.
* @param resolved Flag indicating if this migration is available on the classpath or not.
* @param applied Flag indicating if this migration has been applied or not.
* @param failed Flag indicating if this migration has failed when it was applied or not.
*/
MigrationState(String displayName, boolean resolved, boolean applied, boolean failed) {
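For completeness, here is a hedged sketch, assuming the Flyway 5.2+ Java fluent API, of enabling outOfOrder and then checking each migration's state programmatically; the JDBC URL and credentials are placeholders:

import org.flywaydb.core.Flyway;
import org.flywaydb.core.api.MigrationInfo;
import org.flywaydb.core.api.MigrationState;

public class OutOfOrderCheckSketch {
    public static void main(String[] args) {
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:postgresql://localhost/mydb", "user", "secret") // placeholders
                .outOfOrder(true) // same switch as flyway.outOfOrder=true in the config file
                .load();

        flyway.migrate();

        // Both SUCCESS and OUT_OF_ORDER count as applied (applied flag = true).
        for (MigrationInfo info : flyway.info().all()) {
            MigrationState state = info.getState();
            System.out.printf("%s -> %s (applied: %b)%n",
                    info.getVersion(), state, state.isApplied());
        }
    }
}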

Doctrine: Why can't I free memory when accessing entities through an association?

I have an Application that has a relationship to ApplicationFile:
/**
 * @ORM\OneToMany(
 *     targetEntity="AppBundle\Entity\ApplicationFile",
 *     mappedBy="application",
 *     cascade={"remove"},
 *     orphanRemoval=true
 * )
 */
private $files;
A file entity has a field that stores binary data, and can be up to 2MB in size. When iterating over a large list of applications and their files, PHP memory usage grows. I want to keep it down.
I've tried this:
$applications = $this->em->getRepository('AppBundle:Application')->findAll();
foreach ($applications as $app) {
    // ...
    foreach ($app->getFiles() as $file) {
        // ...
        $this->em->detach($file);
    }
    $this->em->detach($app);
}
Detaching the objects should tell the entity manager to stop caring about them and de-reference them, but surprisingly it has no effect on memory usage - it keeps increasing.
Instead, I have to load the application files manually (rather than retrieving them through the association), and then memory usage does not increase. This works:
$applications = $this->em->getRepository('AppBundle:Application')->findAll();
foreach ($applications as $app) {
    // ...
    $appFiles = $this
        ->em
        ->getRepository('AppBundle:ApplicationFile')
        ->findBy(array('application' => $app));
    foreach ($appFiles as $file) {
        // ...
        $this->em->detach($file);
    }
    $this->em->detach($app);
}
I used xdebug_debug_zval to track references to the $file object. In the first example, there's an extra reference somewhere, which explains why memory is ballooning - PHP is not able to garbage collect it!
Does anyone know why this is? Where is this extra reference and how do I remove it?
EDIT: Explicitly calling unset($file) at the end of its loop has no effect. There are still TWO references to the object at this point (proven with xdebug_debug_zval). One contained in $file (which I can unset), but there's another somewhere else that I cannot unset. Calling $this->em->clear() at the end of the main loop has no effect either.
EDIT 2: SOLUTION: The answer by @origaminal led me to the solution, so I accepted his answer instead of providing my own.
In the first method, where I access the files through the association on $application, this has a side effect of initializing the previously uninitialized $files collection on the $application object I'm iterating over in the outer loop.
Calling $em->detach($application) and $em->detach($file) only tells Doctrine's UOW to stop tracking the objects, but it doesn't affect the array of $applications I'm iterating over, each of which now has a populated collection of $files that eats up memory.
I have to unset each $application object after I'm done with it to remove all references to the loaded $files. To do this, I modified the loops as such:
$applications = $em->getRepository('AppBundle:Application')->findAll();
$count = count($applications);
for ($i = 0; $i < $count; $i++) {
    foreach ($applications[$i]->getFiles() as $file) {
        $file->getData();
        $em->detach($file);
        unset($file);
    }
    $em->detach($applications[$i]);
    unset($applications[$i]);
    // Don't NEED to force GC, but doing so helps for testing.
    gc_collect_cycles();
}
Cascade
EntityManager::detach should indeed remove all references Doctrine has to the entities. But it does not do the same for associated entities automatically.
You need to cascade this action by adding detach to the cascade option of the association:
/**
 * @ORM\OneToMany(
 *     targetEntity="AppBundle\Entity\ApplicationFile",
 *     mappedBy="application",
 *     cascade={"remove", "detach"},
 *     orphanRemoval=true
 * )
 */
private $files;
Now $em->detach($app) should be enough to remove references to the Application entity as well as its associated ApplicationFile entities.
Find vs Collection
I highly doubt that loading the ApplicationFile entities through the association, instead of using the repository to findBy() them, is the source of your issue.
Sure that when loaded through the association, the Collection will have a reference to those child-entities. But when the parent entity is dereferenced, the entire tree will be garbage collected, unless there are other references to those child entities.
I suspect the code you show is pseudo/example code, not the actual code in production. Please examine that code thoroughly to find those other references.
Clear
Sometimes it is worth clearing the entire EntityManager and merging a few entities back in. You could try $em->clear() or $em->clear('AppBundle\Entity\ApplicationFile').
Clear has no effect
You're saying that clearing the EntityManager has no effect. This means the references you're searching for are not within the EntityManager (or its UnitOfWork), because you've just cleared that.
Doctrine but not Doctrine
Are you using any event-listeners or -subscribers? Any filters? Any custom mapping types? Multiple EntityManagers? Anything else that could be integrated into Doctrine or its life-cycle, but is not necessarily part of Doctrine itself?
Especially event-listeners/subscribers are often overlooked when searching for the source of issues. So I'd suggest you start to look there.
If we are speaking about your first implementation, you have extra references to the collection in the PersistentCollection::coll of the Application::files property - this object is created by Doctrine when the Application is instantiated.
With detach you are only deleting the UnitOfWork's references to the object.
There are different ways to fix this, but they all involve hacks. The nicest way is probably to also detach the Application object and unset it.
Still, it is preferable to use more advanced approaches for batch processing; some were listed in the other answer. The current approach forces Doctrine to use proxies and issues extra queries to the DB to get the files of the current object.
Edit
The difference between the first and the second implementation is that there are no circular references in the second case: Application::files stays an uninitialized PersistentCollection (with no elements in coll).
To check this, can you try to drop the files association explicitly?
The trick is in PHP's garbage collector, which works a bit oddly. First of all, each time the script needs memory it allocates memory from RAM; even if you use unset(), $object = null, or other tricks to free memory, the allocated memory is not returned to the operating system until the script finishes and its process is killed.
How to fix that?
Create commands (this is usually done on Linux systems) that run the needed script with limit and offset parameters, and re-run the script in small batches several times. This way the script uses less memory, and the memory is freed each time a run finishes.
Alternatively, get rid of Doctrine, which balloons memory by itself; PDO is much faster and less costly.
For this kind of task, where objects in memory can lead to such leaks, you should use Doctrine 2 iterators.
$appFiles = $this
    ->em
    ->getRepository('AppBundle:ApplicationFile')
    ->findBy(array('application' => $application));
should be refactored to return a Query object rather than an ArrayCollection; from that Query object you can call the iterate() method and clear memory after inspecting each object.
Edit
You have "hidden references" because detach operation will not delete the object in memory, it only tells to EntityManager not to handle it anymore. This is why you should use my solution or unset() the object with php function.

ABAP "SUPPLY": how to use a data-providing function module?

In the automatically generated SUPPLY function modules, the following comments can be seen:
* General Notes
* =============
* A common scenario for a supply method is to aquire key
* informations from the parameter <parent_element> and then
* to invoke a data provider.
* A free navigation thru the context, especially to nodes on
* the same or deeper hierachical level is strongly discouraged,
* because such a strategy may easily lead to unresolvable
* situations!!
*
** data declaration
* DATA lt_nod TYPE wd_this->Elements_nod.
* DATA ls_nod LIKE LINE OF lt_nod.
** #TODO compute values
** e.g. call a data providing **FuBa**
I understand the dangers of navigating through nodes that have an associated supply function but haven't been initialized yet - this basically leads to deadlocks.
What I'd like to know is what a FuBa, or data provider, is and how to use one - all the examples I've found only supply data for a node in a trivial manner and don't tackle this problem.
Is there some way to register the nodes to be updated later, or something else?
In this case, data provider is not a technical term, it's just some coding that provides the data you want to add to the context. Whatever that may be depends on your application context - anything from a local or remote function module or method call, a call to your assistance class or even - if you really want to adopt bad coding habits - to a direct database access.
FuBa is an abbreviation of Funktionsbaustein = function module.
