Preferred way to specify operation timeout - rxandroidble

What would be the recommended way of specifying a timeout for a connection or operation? Currently I use:
ConnectionObservable = mDevice
        .establishConnection(mRxAppCompatActivity, false)
        .timeout(DEFAULT_TIMEOUT_IN_MILLIS, TimeUnit.MILLISECONDS)
But I get the impression that the automatic closing and disconnecting of the GATT that follows is not always done properly, as I sometimes have trouble reconnecting to the same device afterwards.
Would something like
.takeUntil(disconnectTrigger)
with disconnectTrigger.onNext() being triggered manually after the timeout be preferable?

The .establishConnection() function, when called with autoConnect = false, will automatically fail after about 30 seconds.
Both options for disconnecting are equally acceptable. Which one to use depends on the use case, in my opinion.
Note that these timeouts will not have much effect on operations that have already started. The only exception is the connection operation, which can be cancelled, but this functionality was only recently merged into the master branch (https://github.com/Polidea/RxAndroidBle/commit/604853c4f39c5e8a19e02415c50b547b0befd0e7). Operations that have not yet been processed will be removed from the operations queue.
I think that you may be suffering from a broken disconnection mechanism (it's hard to be sure with no logs). There is already a bug for this issue here: https://github.com/Polidea/RxAndroidBle/issues/63
Best Regards

Related

Axon Event Processing Timeout

I am using an Axon Event Tracking processor. Sometimes events take longer than 10 seconds to process.
This seems to cause the message to be processed again, and this appears in the log: "Releasing claim of token X/0 failed. It was owned by another node."
If I up the number of segments it does not log this BUT the event is still processed twice so I think this might be misleading. (I think I was mistaken about this)
I have tried adjusting the fetchDelay, cleanupDelay and tokenClaimInterval. None of which has fixed this. Is there a property or something that I am missing?
Edit
The scenario taking longer than 10 seconds is making an HTTP request to an external service.
I'm using Axon 4.1.2 with all default configuration via Spring auto-configuration. I cannot see the "Releasing claim on token and preparing for retry in [timeout]s" message in the log.
I was having this issue with a single segment and 2 instances of the application. I realised I hadn't increased the number of segments like I thought I had.
After further investigation I have discovered that adding an additional segment seems to have stopped this. Even if I have, for example, 2 segments and 6 applications, it still doesn't reappear. However, I'm not sure how this is different from my original scenario of 1 segment and 2 applications.
I didn't realise it would be possible for multiple threads to grab the same tracking token and process the same event. It sounds like the best action would be to put an idempotency check before the HTTP call?

The "Releasing claim of token [event-processor-name]/[segment-id] failed. It was owned by another node." message can only occur in three scenarios:
1. You are performing a merge operation of two segments which fails because the given thread doesn't own both segments.
2. The main event processing loop of the TrackingEventProcessor is stopped, but releasing the token claim fails because the token is already claimed by another thread.
3. The main event processing loop has caught an Exception, making it retry with an exponential back-off, and it tries to release the claim (which might fail with the given message).
I am guessing it's not options 1 and 2, so that would leave us with option 3. This should also mean you are seeing other WARN level messages, like:
Releasing claim on token and preparing for retry in [timeout]s
Would you be able to share whether that's the case? That way we can pinpoint a little better what the exact problem is you are encountering.
By the way, very likely you have several processes (event handling threads of the TrackingEventProcessor) stealing the TrackingToken from one another. As they're stealing an un-updated token, both (or more) will handle the same event. That is why you see the event handler being invoked twice.
This is obviously undesirable behavior and something we should resolve for you. I would like to ask you to provide answers to my comments under the question, as right now I have too little to go on. Let us figure this out @Dan!
Update
Thanks for updating your question @dan, that's very helpful.
From what you've shared, I am fairly confident that both instances are stealing the token from one another. This does depend though on whether both are using the same database for the token_entry table (although I am assuming they are).
If they are using the same table, then they should "nicely" share their work, unless one of them takes too long. If it takes too long, the token will be claimed by another process. This other process in this case is the thread of the TEP of your other application instance. The "claim timeout" defaults to 10 seconds, which matches the duration of your long-running event handling process.
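To make the stealing described above concrete, here is a toy simulation in Python. It is not Axon code: ToyTokenStore, try_claim and handled_by are made-up names, and the one-second polling interval is an assumption. It only models a claim that another node may take over once it is older than the claim timeout while the owner is still busy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenClaim:
    """One row of a hypothetical token table: who holds the claim, and since when."""
    owner: Optional[str] = None
    claimed_at: float = 0.0


class ToyTokenStore:
    """Toy stand-in for a shared token store (an assumption, not Axon's implementation)."""

    def __init__(self, claim_timeout: float):
        self.claim_timeout = claim_timeout
        self.claim = TokenClaim()

    def try_claim(self, node: str, now: float) -> bool:
        free = self.claim.owner is None
        mine = self.claim.owner == node
        expired = now - self.claim.claimed_at >= self.claim_timeout
        if free or mine or expired:
            self.claim = TokenClaim(owner=node, claimed_at=now)
            return True
        return False


def handled_by(claim_timeout: float, handler_seconds: float) -> list:
    """Return the nodes that end up handling the same event."""
    store = ToyTokenStore(claim_timeout)
    handlers = []
    # Node A claims the token at t=0 and starts a slow event handler.
    if store.try_claim("node-a", now=0.0):
        handlers.append("node-a")
    # Node B polls once per second while A is still busy and has
    # neither extended its claim nor stored an updated token.
    t = 1.0
    while t < handler_seconds:
        if store.try_claim("node-b", now=t):
            handlers.append("node-b")  # B re-processes the same event
            break
        t += 1.0
    return handlers
```

Under these assumptions, handled_by(10.0, 12.0) reports both nodes, while handled_by(30.0, 12.0) reports only node-a, which is the effect a larger claim timeout is meant to have.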
This claimTimeout is adjustable though, by invoking the Builder of the JpaTokenStore/JdbcTokenStore (depending on which you are using / auto wiring) and calling the JpaTokenStore.Builder#claimTimeout(TemporalAmount) method. I think this would be required on your end, given the fact you have a long-running operation.
There are of course different ways of tackling this, like making sure the TEP is only run on a single instance (not really fault tolerant though), or offloading this long-running operation to a scheduled task which is triggered by the event.
But, I think we've found the issue at least, so I'd suggest to tweak the claimTimeout and see if the problem persists.
Let us know if this resolves the problem on your end @dan!
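On the idempotency check raised in the question: independent of the claimTimeout, a guard along these lines would keep a repeated delivery from triggering the HTTP call twice. This is only a minimal Python sketch. IdempotentHandler and its in-memory set are hypothetical, and with several application instances the processed ids would have to live in shared storage with an atomic check.

```python
class IdempotentHandler:
    """Skips events whose id has already been handled successfully."""

    def __init__(self, side_effect):
        self._side_effect = side_effect   # e.g. the function doing the HTTP request
        self._processed = set()           # assumption: in-memory, single process only

    def handle(self, event_id, payload) -> bool:
        """Return True if the side effect ran, False if the event was a duplicate."""
        if event_id in self._processed:
            return False
        self._side_effect(payload)
        # Record the id only after success, so a failed call can be retried.
        self._processed.add(event_id)
        return True
```

Recording the id after the call means a crash between the call and the bookkeeping can still produce one duplicate; recording it before would instead risk losing the event.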

How to kill a thread, stop a promise execution in Raku

I am looking for a way to stop (send an exception to) a running promise on SIGINT. The examples given in the doc exit the whole process and not just one worker.
Does someone know how to "kill", "unschedule", "stop" a running thread?
This is for a p6-jupyter-kernel issue or this REPL issue.
The current solution restarts the REPL but does not kill the blocked thread:
await Promise.anyof(
    start {
        ENTER $running = True;
        LEAVE $running = False;
        CATCH {
            say $_;
            reset;
        }
        $output :=
            self.repl-eval($code, :outer_ctx($!save_ctx), |%adverbs);
    },
    $ctrl-c
);

Short version: don't use threads for this, use processes. Killing the running process probably is the best thing that can be achieved in this situation in general.
Long answer: first, it's helpful to clear up a little confusion in the question.
First of all, there's no such thing as a "running Promise"; a Promise is a data structure for conveying a result of an asynchronous operation. A start block is really doing three things:
Creating a Promise (which it evaluates to)
Scheduling some code to run
Arranging that the outcome of running that code is reflected by keeping or breaking the Promise
That may sound a little academic, but really matters: a Promise has no awareness of what will ultimately end up keeping or breaking it.
Second, a start block is not - at least with the built-in scheduler - backed by a thread, but rather runs on the thread pool. Even if you could figure out a way to "take out" the thread, the thread pool scheduler is not going to be happy with having one of the threads it expects to eat from the work queue disappear. You could write your own scheduler that really does back work with a fresh thread each time, but that still isn't a complete solution: what if the piece of code the user has requested execution of schedules work of its own, and then awaits that? Then there is no one thread to kill to really bring things to a halt.
Let's assume, however, that we did manage to solve all of this, and we get ourselves a list of one or more threads that we really want to kill without their cooperation (cooperative situations are fairly easy; we use a Promise and have code poll that every so often and die if that cancellation Promise is ever kept/broken).
Any such mechanism that wants to be able to stop a thread blocked on anything (not just compute, but also I/O, locking, etc.) would need deep integration and cooperation from the underlying runtime (such as MoarVM). For example, trying to cancel a thread that is currently performing garbage collection will be a disaster (most likely deadlocking the VM as a whole). Other unfortunate cancellation times could lead to memory corruption if it was halfway through an operation that is not safe to interrupt, deadlocks elsewhere if the killed thread was holding locks, and so forth. Thus one would need some kind of safe-pointing mechanism. (We already have things along those lines in MoarVM to know when it's safe to GC; however, cancellation implies different demands. It probably cross-cuts numerous parts of the VM codebase.)
And that's not all: the same situation repeats at the Raku language level too. Lock::Async, for example, is not a kind of lock that the underlying runtime is aware of. Probably the best one can do is try to tear down the callstack and run all of the LEAVE phasers; that way there's some hope (if folks used the .protect method; if they just called lock and unlock explicitly, we're done for). But even if we manage not to leak resources (already a big ask), we still don't know - in general - if the code we killed has left the world in any kind of consistent state. In a REPL context this could lead to dubious outcomes in follow-up executions that access the same global state. That's probably annoying, but what really frightens me is folks using such a cancellation mechanism in a production system - which they will if we implement it.
So, effectively, implementing such a feature would entail doing a significant amount of difficult work on the runtime and Rakudo itself, and the result would be a huge footgun (I've not even enumerated all the things that could go wrong, just the first few that came to mind). By contrast, killing a process clears up all resources, and a process has its own memory space, so there's no consistency worries either.
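The process-based approach can be sketched as follows. The sketch is in Python rather than Raku, purely as a language-neutral illustration, and run_isolated is a made-up helper name. The untrusted code runs in a fresh interpreter process, which the parent kills if it does not finish in time.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float) -> str:
    """Run `code` in a separate interpreter process; kill it after `timeout` seconds."""
    proc = subprocess.Popen([sys.executable, "-c", code])
    try:
        proc.wait(timeout=timeout)
        return "finished"
    except subprocess.TimeoutExpired:
        # Killing the process releases its memory, locks and file handles
        # in one go, so no shared state is left half-updated in the parent.
        proc.kill()
        proc.wait()
        return "killed"
```

In a REPL or kernel the same timeout branch could be triggered by SIGINT instead of a timer. The cost is that state which must survive between evaluations has to be passed across the process boundary explicitly.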

There is currently no way to stop a thread if it doesn't want to be stopped.
A thread can check a flag every so often, and decide to call it quits if that flag is set. It would be very nice if we had a way to throw an exception inside a thread from another thread. But we do not, at least not as far as I know.
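The flag-checking approach looks roughly like this. It is a Python sketch of the general pattern, not Raku code; count_until_stopped and demo are illustrative names. The worker only stops because it checks the flag between units of work.

```python
import threading


def count_until_stopped(stop: threading.Event, ticks: list) -> None:
    """Do small units of work, checking the stop flag between units."""
    while not stop.is_set():
        ticks.append(1)      # one unit of work
        stop.wait(0.01)      # pause, but wake up early if the flag gets set


def demo() -> bool:
    """Start a worker, ask it to stop, and report whether it really ended."""
    stop = threading.Event()
    ticks = []
    worker = threading.Thread(target=count_until_stopped, args=(stop, ticks))
    worker.start()
    stop.set()               # cancellation request from another thread
    worker.join(timeout=5.0)
    return not worker.is_alive()
```

This does nothing for a worker that is stuck inside a single blocking call, which is exactly the case the question is about.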

MQSeries: Is syncpoint/rollback possible when getting asynchronously with MCB?

I want to pull messages off a MQS queue in a C client, and would love to do so asynchronously so I don't have to start (explicitly) multithreading. The messages will be forwarded to another system that acts "transactionally" but is completely incompatible with XA. So I'd like to have a way to explicitly commit (and thereby remove) a message that's been successfully handed off to the other system, and not commit if this failed, so that the last message is retained for a more successful later attempt.
I've read about the SYNCPOINT option and understand how I'd use that around a regular GET, but I haven't seen any hints on how to make asynchronous message retrieval have transactional behavior like this. Any hints, please?

I think you are describing the asynchronous callback capability, i.e. you register a routine to be called when a message arrives, and ask for any get to be under syncpoint. An explanation of how some of it works is given here: https://share.confex.com/share/117/webprogram/Handout/Session9513/share_advanced_mqi.pdf (page 4 onwards).
Effectively you get called with the MQ message under syncpoint, do your processing with another system, then commit or rollback the message before returning.
Be aware that without something like XA two-phase commit there will always be a window of failure: for example, you commit to the external system, and then a power outage rolls back the message in the MQ unit of work because you didn't have time to perform the MQ commit. The message will then be delivered again.
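The control flow of such a callback can be sketched as follows. This is a Python toy, not the MQI: ToyQueue and on_message are invented for illustration, and the comments only point at the MQ calls they stand in for.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a queue with a one-message unit of work."""

    def __init__(self, messages):
        self._messages = deque(messages)
        self._in_flight = None

    def get_under_syncpoint(self):
        # Stands in for a get with MQGMO_SYNCPOINT: the message is taken
        # provisionally and only disappears on commit.
        if self._in_flight is not None:
            raise RuntimeError("previous message was neither committed nor backed out")
        if not self._messages:
            return None
        self._in_flight = self._messages.popleft()
        return self._in_flight

    def commit(self):        # stands in for MQCMIT
        self._in_flight = None

    def backout(self):       # stands in for MQBACK
        if self._in_flight is not None:
            self._messages.appendleft(self._in_flight)
            self._in_flight = None

    def depth(self) -> int:
        return len(self._messages)


def on_message(queue: ToyQueue, forward) -> bool:
    """Callback body: hand the message off, then commit or back out."""
    message = queue.get_under_syncpoint()
    if message is None:
        return False
    try:
        forward(message)     # hand-off to the non-XA external system
    except Exception:
        queue.backout()      # message stays on the queue for a later attempt
        return False
    queue.commit()           # a crash before this line causes a redelivery
    return True
```

Because the external hand-off and the commit are two separate steps, the receiving system should tolerate seeing the same message twice.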

Edit: my misunderstanding, didn't realise that the application was using a callback to retrieve messages, which is indeed fully asynchronous behavior. Disregard the answer below.
Do MQGET with MQGMO_SYNCPOINT, then issue either MQCMIT or MQBACK.
"Asynchronous" and "synchronous" may be misnomers - these are your patterns of using MQ - whether you wait for a reply message or not, these patterns do not affect how MQ processes your calls. Transaction management (unit of work management) works across any MQI calls that use SYNCPOINT, no matter if they are part of a request/reply pattern or not.

Managing Firebase's concurrent users

I have a few microservices (AWS Lambda) which when started will authenticate with Firebase, change state in an appropriate manner, and then stop execution.
I have found out tonight though that I must not be cleaning up after myself as I have 100 concurrent users and it's only me doing some testing work (albeit testing authentication).
I've seen references to the goOffline() method and possibly could use that but it seems a little hacky. Are there any good examples of how one can ensure that you're closing and/or reusing an existing connection?

When spinning up server-side processes that use Firebase, it's vital to make sure the process actually exits when it is finished or killed.
If the process is killed, then the connection will get killed as well. But, there's nothing hacky about calling Firebase.goOffline(), as it will close the connection.

How do I specify a redelivery policy and separate retry queue processor in Rebus

I'm currently investigating Rebus, but I have been unable to find good documentation, so this is proving difficult. I am hoping someone can help me understand this exciting product.
I have read that during message processing, if something goes wrong the message will return to the queue.
1) Is the message returned to the front of the queue or placed at the end? If it is placed at the front, this will be a problem, because the queue in essence becomes blocked by a message that may not be processable, at least until it times out or the retries are exhausted.
2) Does Rebus have support for an out-of-the-box separate retry queue?
3) Can I specify the interval between retries?
4) Can I specify an exponential backoff interval for retries, as in Apache ActiveMQ?
Thanks

1) The queue transaction is rolled back, effectively moving the message back in front - therefore, it will be immediately retried.
After 5 failed attempts (at least that is the default), Rebus will move the message to the error queue. The default retry mechanism is intentionally very swift - this way, the input queue will never be clogged by poisonous messages.
If you need more sophisticated retries, I suggest you take a look at bus.Defer - it can defer delivery of a message to the future. It requires that you have a timeout manager(*) running though.
2) I guess that's what I call "error queue", except there's no retry :)
I did create a solution some time ago, though, where I coded a simple endpoint that would periodically empty the error queue and move all the messages back into the original source queue, as a form of crude automatic second-level retry mechanism.
3) No. NServiceBus has the concept of second-level retries, but this is something that I've never really needed (enough) with Rebus. But with Rebus, you're on your own here - it should be fairly easy to do some intelligent bus.Defer that can then be easily adapted to each kind of error that you're expecting.
4) See (3)
I hope that clarifies a bit :)
(*) The timeout manager can be a separate endpoint whose only job in life is to receive a message, hold on to it for a while (i.e. save it to a database), and then return it to the sender when the time has elapsed. The timeout manager can also be hosted in-process, by using the .Timeouts(t => t.???) configuration spell.
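For the do-it-yourself backoff mentioned under (3), the decision logic could look something like this. It is a Python sketch of the arithmetic only, not Rebus code: next_step and all of its defaults are assumptions, and the returned delay is what a handler would pass to a deferred send.

```python
from typing import Optional, Tuple


def next_step(
    failed_attempts: int,
    base_seconds: float = 1.0,
    factor: float = 2.0,
    max_attempts: int = 5,
    cap_seconds: float = 60.0,
) -> Tuple[str, Optional[float]]:
    """Decide what to do with a message after `failed_attempts` failures (>= 1)."""
    if failed_attempts >= max_attempts:
        # Give up and park the message, like the error queue described above.
        return ("error-queue", None)
    # Exponential backoff: base, base*factor, base*factor^2, ... up to the cap.
    delay = min(base_seconds * factor ** (failed_attempts - 1), cap_seconds)
    return ("defer", delay)
```

With the defaults, the first four failures defer by 1, 2, 4 and 8 seconds, and the fifth sends the message to the error queue. The failure count has to travel with the message (for example in a header) for this to work.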