Axon EventHandlers in TrackingEventProcessors were not invoked

A few days ago a client told us that an entity couldn't be found, although our system had returned a 202, meaning the request was processed successfully.
The request was indeed processed successfully, as the events caused by the command behind the request can be found in the EventStore.
What is so weird about this issue is that none of the tracking processors were invoked for the events caused by the first request.
On subsequent events the tracking processors threw exceptions, as the entity couldn't be found in the database.
So the event handlers were not invoked for the events from the first command, but they were invoked for the events from subsequent commands.
What could possibly have gone wrong with the events from the first command, given that the tracking processors were not invoked?
Our setup runs 2 Axon instances connected to 1 PostgreSQL database.
The processors are backed by JPA and JDBC repositories.
Below is a simplified snapshot of the events in the EventStore:
global_index | payload_type | timestamp | aggregate_id | sequence_number | my conclusions
8219359 | Aggregate1CreatedEventDueToRequest1 | 2021-02-08T16:25:33.763048105Z | 1 | 0 |
8235842 | Aggregate2CreatedEventDueToRequest0 | 2021-02-08T17:12:28.036313918Z | 2 | 0 | Not handled by TPs
8235843 | Aggregate2Event2DueToRequest1 | 2021-02-08T17:12:28.036338611Z | 2 | 1 | Not handled by TPs, but handled by Subscribing Processor
8235844 | Aggregate2Event3DueToRequest1 | 2021-02-08T17:12:28.03635331Z | 2 | 2 | Not handled by TPs
8235845 | Aggregate2Event4DueToRequest1 | 2021-02-08T17:12:28.036372834Z | 2 | 3 | Not handled by TPs
8235847 | Aggregate1Event2DueToAggregate2Event2DueToRequest1 | 2021-02-08T17:12:28.045952574Z | 1 | 1 |
8235848 | Aggregate1Event3DueToAggregate2Event2DueToRequest1 | 2021-02-08T17:12:28.04599099Z | 1 | 2 |
8235849 | Aggregate1Event4DueToAggregate2Event2DueToRequest1 | 2021-02-08T17:12:28.04608497Z | 1 | 3 |
8235850 | Aggregate2Event5DueToRequest2 | 2021-02-08T17:12:28.058377074Z | 2 | 4 | Handled by TPs, but exceptions as previous events not handled

Related

Polling multiple SQS messages using Airflow SQSSensor

I am using these SQSSensor settings to poll messages:
from airflow.providers.amazon.aws.sensors.sqs import SQSSensor

fetch_sqs_message = SQSSensor(
task_id="...",
sqs_queue="...",
aws_conn_id="aws_default",
max_messages=10,
wait_time_seconds=30,
poke_interval=60,
timeout=300,
dag=dag
)
I would assume that every time it polls, it should retrieve up to 10 messages; my queue had around 5 when I tested this.
But each time I trigger the DAG, it only polls 1 message at a time, which I found out from the SQS message count.
Why is it doing this? How can I get it to poll as many messages as possible?
Recently, a new feature has been added to SQSSensor so that the sensor can poll SQS multiple times instead of only once.
You can check out this merged PR.
For example, if num_batches is set to 3, SQSSensor will poll the queue 3 times before returning the results.
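Assuming a provider version that includes that PR, the sensor from the question would only need the extra num_batches argument; a sketch (all other parameters unchanged from the question):

from airflow.providers.amazon.aws.sensors.sqs import SQSSensor

# Poll the queue 3 times per poke, fetching up to max_messages each time.
fetch_sqs_message = SQSSensor(
    task_id="fetch_sqs_message",
    sqs_queue="...",
    aws_conn_id="aws_default",
    max_messages=10,
    num_batches=3,
    wait_time_seconds=30,
    poke_interval=60,
    timeout=300,
    dag=dag,
)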
Disclaimer: I contributed to this feature.

Design advice on processing large-volume files in parallel

I am looking for design advice on the use case below.
I am designing an application which can process EXCEL/CSV/JSON files. They all contain the same columns/attributes; there are about 72 of them. These files may contain up to 1 million records.
Now I have two options to process those files.
Option 1
Service 1: Read the content from the given file, convert each row into JSON, and save the records into a SQL table by batch processing (3k records per batch).
Service 2: Fetch those JSON records from the database table (saved in step 1), process them (validation and calculation), and save the final results into a separate table.
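For illustration, a minimal sketch of Option 1's Service 1 in Python (the question's stack is .NET Core, so the save_batch helper here is a hypothetical stand-in for the real SQL bulk insert):

import json

BATCH_SIZE = 3000  # 3k records per batch, as described above

def save_batch(batch):
    # Hypothetical stand-in for the actual SQL bulk insert
    # (e.g. SqlBulkCopy in .NET).
    print(f"saving {len(batch)} records")

def process_file(rows):
    # Service 1: convert each row to JSON and save in fixed-size batches.
    batch = []
    for row in rows:
        batch.append(json.dumps(row))
        if len(batch) == BATCH_SIZE:
            save_batch(batch)
            batch = []
    if batch:  # flush the final partial batch
        save_batch(batch)

process_file({"col": i} for i in range(100_000))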
Option 2 (using RabbitMQ)
Service 1: Read the content from the given file and send every row as a message to a queue. If the file contains 1 million records, this service will send 1 million messages to the queue.
Service 2: Listen to the queue created in step 1, process those messages (validation and calculation), and save the final results into a separate table.
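And a corresponding sketch of Option 2's Service 1, using pika (a Python RabbitMQ client) in place of the actual .NET client; the queue name and connection details are illustrative:

import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="file_rows", durable=True)

def publish_rows(rows):
    # Service 1: one message per row, marked persistent so a broker
    # restart does not lose them.
    for row in rows:
        channel.basic_publish(
            exchange="",
            routing_key="file_rows",
            body=json.dumps(row),
            properties=pika.BasicProperties(delivery_mode=2),
        )

publish_rows({"col": i} for i in range(1000))
connection.close()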
POC experience with Option 1:
It took 5 minutes to read and batch-save the data into the table for 100K records (the job of Service 1).
If the application tries to process multiple files in parallel, each containing 200K records, I sometimes see deadlocks.
No indexes or relationships are created on this batch processing table.
Saving 3000 records per batch to avoid table locks.
While the services are processing, results are trackable and the progress can be queried. For example, for "File 1.JSON", 50000 records are processed successfully and the remaining 1000 are in progress.
If Service 1 finishes its job correctly and something goes wrong with Service 2, we still have better control to reprocess those records, as they are persisted in the database.
I am planning to delete the data in the batch processing table with a nightly SQL job once all records have been processed by Service 2, so the table will be fresh and ready to store the data for the next day's processing.
POC experience with Option 2:
Producing (Service 1) and consuming (Service 2) the messages for a 100K-record file took around 2 hours 30 minutes.
No storage of file data in the database, so no deadlocks (unlike Option 1).
Results are not as trackable as in Option 1 while the services are processing the records, which matters for sharing the status with the clients who sent the file for processing.
We can see the status of messages on the RabbitMQ management screen for monitoring purposes.
If Service 1 partially reads the data from a given file and errors out due to some issue, then to my knowledge there is no way to roll back the messages already published to RabbitMQ, so the consumer keeps working on those published messages.
I can horizontally scale the application in both of these options to speed up the process.
Given the facts above, both options have advantages and disadvantages. Is this a good use case for RabbitMQ? Is it advisable to produce and consume millions of records through RabbitMQ? Is there a better way to deal with this use case apart from these 2 options?
Please advise.
*** I am using .NET Core 5.0 and SQL Server 2019. Service 1 and Service 2 are .NET Core worker services (Windows jobs). All tests are done on my local machine, and RabbitMQ is installed in Docker (Docker is on my local machine).

Azure Service Bus Topic trigger in Function App - limiting the number of messages read

I have a Service Bus trigger in an Azure Function App which reads the messages (in JSON format) coming from the subscription. I would like to know if there is a way to limit the number of messages processed by Service Bus. For example, if my Service Bus trigger fires and there are 20 messages to be processed, I would like only the first 10 to be processed, and then the next 10. How can I achieve that?
I am asking this because I am doing some manipulation with the received messages: first I create a list of the information and run some SQL queries over it in C#, and I would prefer my code NOT to handle all the messages at once.
You can configure this in the host.json. Here's the documentation:
learn.microsoft.com
Just add "maxConcurrentCalls": 10 to the messageHandlerOptions; then it will only process 10 messages simultaneously.
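In a Functions v2+ host.json, that setting sits under the serviceBus extension section, roughly like this (the version field and surrounding structure are the standard defaults, assumed here):

{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 10
      }
    }
  }
}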

asterisk queue_log late COMPLETEAGENT

I am trying to write a wallboard for my Asterisk server. This wallboard will process the queue_log file in /var/log/asterisk.
Here is the scenario in question:
1) A customer calls our call center. Let his number be 44556677889900 and our number 8881234567890.
2) The customer enters queue 210.
3) Agent 1 takes the call.
4) Agent 1 decides that the call should go to another queue and transfers it to queue 209.
5) Agent 2 takes the call.
6) Agent 2 terminates the call after talking with the customer. (While Agent 2 is talking on the phone, Agent 1 is idle and available for a new call.)
7) Normally Agent 1 ended his call at step 4, but the COMPLETEAGENT log entry appears only now, even though the agent has been available since step 4.
Here is the output in the queue_log:
1550582529|1550582516.26480|210|NONE|DID|8881234567890                     (step 1)
1550582529|1550582516.26480|210|NONE|ENTERQUEUE||44556677889900|1          (step 2)
1550582531|1550582516.26480|210|Test Agent 1|CONNECT|2|1550582529.26493|2  (step 3)
1550582536|1550582536.26498|209|NONE|DID|                                  (step 4)
1550582536|1550582536.26498|209|NONE|ENTERQUEUE||9991|1                    (step 4)
1550582539|1550582536.26498|209|Test Agent 2|CONNECT|3|1550582536.26499|2  (step 5)
1550582543|1550582536.26498|209|Test Agent 2|COMPLETECALLER|3|4|1          (step 6)
1550582549|1550582516.26480|210|Test 1|COMPLETEAGENT|2|18|1                (step 7)
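For reference, each queue_log line is pipe-delimited as timestamp|callid|queuename|agent|event|event-parameters. A minimal parsing sketch for lines like the above:

def parse_queue_log_line(line):
    # Split one queue_log line: timestamp|callid|queue|agent|event|params...
    fields = line.rstrip("\n").split("|")
    timestamp, callid, queue, agent, event = fields[:5]
    return {
        "timestamp": int(timestamp),
        "callid": callid,
        "queue": queue,
        "agent": agent,
        "event": event,
        "params": fields[5:],  # meaning depends on the event type
    }

entry = parse_queue_log_line("1550582549|1550582516.26480|210|Test 1|COMPLETEAGENT|2|18|1")
print(entry["event"], entry["params"])  # COMPLETEAGENT ['2', '18', '1']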
As mentioned in step 7, Agent 1 is available for new calls after he transfers the call to queue 209. (In fact, if a new call comes in, the system sends it to Agent 1.) However, the COMPLETEAGENT log entry appears only when the customer disconnects.
This makes my wallboard think that Agent 1 is busy even when he is not. And worse, if he receives a new call before Agent 2 finishes, everything gets more complicated.
Questions:
1) How is it possible to make the system send COMPLETEAGENT at step 4?
2) Why is the ATTENDEDTRANSFER log entry missing? (Not directly related to this problem, but it may be connected.)
Asterisk Version: 13.22.0
Freepbx 14.0.5.25
Thank you in advance.
1) The system should not send COMPLETEAGENT at step 4, because that event should be sent AFTER the END of the call.
That event is created by the QUEUE, not by the AGENT. From the queue's point of view, the call is not yet finished.
If you want it to be finished, transfer the A-leg, not the queue's leg.
2) The transfer subsystem is not related to the queue subsystem, and SHOULD NOT be related in any reliable PBX. You can write your own if you want.
Side notes:
There is no point parsing queue_log; it is much simpler to set up queue_log in MySQL or another database and read it from there.
You can write your own queue system using Async AGI.
You can add as many logs as you want by using dialplan CEL or UserEvents.

Difference between "Hangup" and "Remote end Busy" in Asterisk while calling through a call file?

I am trying to make a call through a call file. It works fine, but I was trying to check all the possible messages displayed in the CLI, such as when the user hangs up the call, when the call goes to a switched-off number, or when the user is busy on another call. If the user cuts the call, the status is sometimes Hangup and sometimes Busy. The exact messages are given below:
Call Failed To Go through, reason (5) Remote end is Busy
Call Failed To Go through, reason (1) Hangup
I don't know what the reason is, as it displays different messages every time. Where can I find the meaning of the reason codes (5) and (1) so I can look into the details?
Interestingly enough, the reason codes returned for call files are not the same as the canonical Asterisk hangup cause codes. Instead, most likely for historical compatibility reasons, call files use their own mechanism for reporting what happened to a call. In this case, that would be:
0 - "Call Failure (not BUSY, and not NO_ANSWER, maybe Circuit busy or down?)"
1 - "Hangup"
2 - "Local Ring"
3 - "Remote end Ringing"
4 - "Remote end has Answered"
5 - "Remote end is Busy"
8 - "Congestion (circuits busy)"
(any other value) - "Unknown"
The interpretation of these should mostly be:
1 or 4 - the call was answered by the remote party
2 or 3 - the call was terminated by the initiator before the call was answered
5 - the remote end was busy
8 - the remote end was congested
0 or any other value - something bad happened to the call
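If you want to translate these codes programmatically, a small lookup table built from the list above is enough; a sketch:

# Call-file reason codes as listed above (distinct from Asterisk hangup cause codes).
CALL_FILE_REASONS = {
    0: "Call Failure (not BUSY, and not NO_ANSWER, maybe Circuit busy or down?)",
    1: "Hangup",
    2: "Local Ring",
    3: "Remote end Ringing",
    4: "Remote end has Answered",
    5: "Remote end is Busy",
    8: "Congestion (circuits busy)",
}

def describe_reason(code):
    # Any value not in the table means "Unknown", per the list above.
    return CALL_FILE_REASONS.get(code, "Unknown")

print(describe_reason(5))   # Remote end is Busy
print(describe_reason(99))  # Unknown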
