Azure Cosmos Java SDK overriding default retry values? - azure-cosmosdb

I am using Azure Cosmos SDK version 4.10.0. I would like to know whether there is a way to override the retry values for exceptions other than the throttling exception. I know that the retry behavior for throttled requests (429) can be overridden using ThrottlingRetryOptions.
Some other exceptions which have retries are
Network Failures: Max retries - 120
GoneExceptions
PartitionIsMigratingException
Default values:
Backoff Constants
private final static int DEFAULT_WAIT_TIME_IN_SECONDS = 30;
//Note: Wait time in seconds after which the exception is logged with the following warning: "Received {} after backoff/retry. Will fail the request.{exception}". No further retries are attempted after this duration.
private final static int MAXIMUM_BACKOFF_TIME_IN_SECONDS = 15;
//Note: Maximum time in seconds a backoff occurs
private final static int INITIAL_BACKOFF_TIME = 1; // in seconds
private final static int BACK_OFF_MULTIPLIER = 2;
We don't want our client to wait for 30 seconds because of a network issue; we would rather return a failure after, say, 5 seconds.
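To make the numbers above concrete, here is a minimal sketch of the backoff schedule those constants imply. It is not the SDK's retry code: it assumes a plain exponential backoff (initial delay 1 s, multiplier 2, capped at 15 s) that stops once the next delay would exceed the total wait time, and it ignores the time spent on the requests themselves. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Azure Cosmos SDK.
final class BackoffSchedule {
    // Values copied from the constants quoted in the question.
    static final int INITIAL_BACKOFF_SECONDS = 1;
    static final int BACKOFF_MULTIPLIER = 2;
    static final int MAXIMUM_BACKOFF_SECONDS = 15;

    /** Returns the backoff delays (in seconds) that fit into the given total wait budget. */
    static List<Integer> delays(int totalWaitSeconds) {
        List<Integer> result = new ArrayList<>();
        int elapsed = 0;
        int next = INITIAL_BACKOFF_SECONDS;
        // Assumption: retrying stops when the next delay would exceed the budget.
        while (elapsed + next <= totalWaitSeconds) {
            result.add(next);
            elapsed += next;
            next = Math.min(next * BACKOFF_MULTIPLIER, MAXIMUM_BACKOFF_SECONDS);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(delays(30)); // the 30-second default wait time
        System.out.println(delays(5));  // the 5-second budget we would like
    }
}
```

Under these assumptions a 30-second budget allows delays of 1, 2, 4, 8 and 15 seconds, while a 5-second budget would only leave room for the first two delays (1 and 2 seconds).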

Java SDK about request timeout and retry options:

Related

How to retry failed ConsumerRecord in reactor-kafka

I am trying out reactor-kafka for consuming messages. Everything else works fine, but I want to add a retry(2) for failing messages. spring-kafka already retries a failed record 3 times by default, and I want to achieve the same using reactor-kafka.
I am using spring-kafka as a wrapper for reactive-kafka. Below is my consumer template:
reactiveKafkaConsumerTemplate
.receiveAutoAck()
.map(ConsumerRecord::value)
.flatMap(this::consumeWithRetry)
.onErrorContinue((error, value)->log.error("something bad happened while consuming : {}", error.getMessage()))
.retryWhen(Retry.backoff(30, Duration.of(10, ChronoUnit.SECONDS)))
.subscribe();
Let us consider the consume method is as follows
public Mono<Void> consume(MessageRecord message){
return Mono.error(new RuntimeException("test retry")); //sample error scenario
}
I am using the following logic to retry the consume method on failure.
public Mono<Void> consumeWithRetry(MessageRecord message){
return consume(message)
.retry(2);
}
I want to retry consuming the message if the current consumer record fails with an exception. I have tried to wrap the consume method with another retry(3), but that does not serve the purpose. The last retryWhen is only for retrying the subscription on Kafka rebalances.
@simon-baslé @gary-russell
Previously while retrying I was using the below approach:
public Mono<Void> consumeWithRetry(MessageRecord message){
return consume(message)
.retry(2);
}
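But it was not retrying: retry(2) resubscribes to the Mono that consume(message) has already returned, so the consume method itself is not invoked again. After wrapping the call in Mono.defer, consume is invoked again on every subscription and the required retries happen: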
public Mono<Void> consumeWithRetry(MessageRecord message){
return Mono.defer(()->consume(message))
.retry(2);
}
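The difference between the two versions can be shown without Reactor. The sketch below is only an analogy in plain Java: retry, eager and consume are hypothetical stand-ins, where eager plays the role of calling consume(message) once up front and a plain Supplier plays the role of Mono.defer. It counts how often the consume logic actually runs.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Plain-Java analogy; none of these names are Reactor or reactor-kafka APIs.
final class DeferRetryDemo {
    static final AtomicInteger calls = new AtomicInteger();

    // Stand-in for consume(message): fails on the first two invocations.
    static String consume(String message) {
        int attempt = calls.incrementAndGet();
        if (attempt < 3) {
            throw new IllegalStateException("test retry, attempt " + attempt);
        }
        return "processed " + message;
    }

    // Like retry(n): asks the same source again, up to maxRetries extra times.
    static <T> T retry(Supplier<T> source, int maxRetries) {
        RuntimeException last = null;
        for (int i = 0; i <= maxRetries; i++) {
            try {
                return source.get();
            } catch (RuntimeException e) {
                last = e;
            }
        }
        throw last;
    }

    // Runs the work once, immediately, and replays that single outcome to every caller.
    static <T> Supplier<T> eager(Supplier<T> work) {
        try {
            T value = work.get();
            return () -> value;
        } catch (RuntimeException e) {
            return () -> { throw e; };
        }
    }

    public static void main(String[] args) {
        calls.set(0);
        try {
            retry(eager(() -> consume("record-1")), 2);
        } catch (IllegalStateException e) {
            System.out.println("eager: failed, consume ran " + calls.get() + " time(s)");
        }

        calls.set(0);
        String result = retry(() -> consume("record-1"), 2);
        System.out.println("deferred: " + result + ", consume ran " + calls.get() + " time(s)");
    }
}
```

In the eager case the work runs once and every retry sees the same failure. In the deferred case each retry runs consume again, so the third attempt succeeds.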

StackExchange.Redis ConnectionMultiplexer pool for synchronous methods

Does implementing a ConnectionMultiplexer pool make sense if we use it for synchronous methods?
So by pool I mean creating multiple instances of StackExchange.Redis ConnectionMultiplexer, storing those objects and when I want to communicate with Redis server I take the least used one from the pool. This is to prevent timeouts due to large queue size as per No. 10 suggestion in this article: https://azure.microsoft.com/en-us/blog/investigating-timeout-exceptions-in-stackexchange-redis-for-azure-redis-cache/
I'm having doubts because I'm not sure how a queue can even form if the ConnectionMultiplexer blocks a thread until a call returns.
It seems to me that having a pool is pointless with sync method calls, but Redis best practice articles suggest creating this kind of pool regardless of method type (sync/async)
I think you're getting confused here. ConnectionMultiplexer does not "get blocked". Creating a ConnectionMultiplexer gives you a factory-like object with which you can create IDatabase instances. You then use these instances to perform normal Redis queries. You can also do Redis queries with the connection multiplexer itself, but those are server queries and unlikely to be done often.
So, to make things short, it can help tremendously to have a pool of connection multiplexers, regardless of sync/async/mixed usage.
To expand further, here's a very simple pool implementation, which can certainly be enhanced further:
public interface IConnectionMultiplexerPool
{
Task<IDatabase> GetDatabaseAsync();
}
public class ConnectionMultiplexerPool : IConnectionMultiplexerPool
{
private readonly ConnectionMultiplexer[] _pool;
private readonly ConfigurationOptions _redisConfigurationOptions;
public ConnectionMultiplexerPool(int poolSize, string connectionString) : this(poolSize, ConfigurationOptions.Parse(connectionString))
{
}
public ConnectionMultiplexerPool(int poolSize, ConfigurationOptions redisConfigurationOptions)
{
_pool = new ConnectionMultiplexer[poolSize];
_redisConfigurationOptions = redisConfigurationOptions;
}
public async Task<IDatabase> GetDatabaseAsync()
{
var leastPendingTasks = long.MaxValue;
IDatabase leastPendingDatabase = null;
for (int i = 0; i < _pool.Length; i++)
{
var connection = _pool[i];
if (connection == null)
{
_pool[i] = await ConnectionMultiplexer.ConnectAsync(_redisConfigurationOptions);
return _pool[i].GetDatabase();
}
var pending = connection.GetCounters().TotalOutstanding;
if (pending < leastPendingTasks)
{
leastPendingTasks = pending;
leastPendingDatabase = connection.GetDatabase();
}
}
return leastPendingDatabase;
}
}

MassTransit + Azure Service Bus: Default TTL configuration for topics and other questions

First of all, excuse my English, it's very bad. I am using MassTransit with Azure Service Bus for asynchronous communication between microservices. I have some doubts about the default configuration of masstransit together with azure service bus, but I have not been able to clarify them in the documentation. These doubts are the following:
Is there a way to set a default TTL for all queues and topics, and a default size for them? The documentation explains that the TTL and the size of a queue can be adjusted when the receiver is connected or created. For topics, it explains how to adjust the TTL when the consumer connects to a specific topic. In our case, however, the consumers do not subscribe to a specific topic but simply to a message type, and MassTransit does the rest (it creates the topic and the subscriber, and forwards the message from the subscriber to the consumer's queue).
On the other hand, what are the default values of PrefetchCount and MaxConcurrentCalls?
And some optimal values to obtain the highest performance considering that consumers have horizontal scaling (competing consumer)? From what I have read in other questions, the PrefetchCount and MaxConcurrentCalls values must be close to each other to optimize performance, is that correct?
Thank you very much.
Regards
The defaults for all Azure Service Bus entities are in this file and can be adjusted prior to bus creation.
The defaults include (edited):
namespace MassTransit.Azure.ServiceBus.Core
{
public static class Defaults
{
public static TimeSpan LockDuration = TimeSpan.FromMinutes(5);
public static TimeSpan DefaultMessageTimeToLive = TimeSpan.FromDays(365 + 1);
public static TimeSpan BasicMessageTimeToLive = TimeSpan.FromDays(14);
public static TimeSpan AutoDeleteOnIdle => TimeSpan.FromDays(427);
public static TimeSpan TemporaryAutoDeleteOnIdle => TimeSpan.FromMinutes(5);
public static TimeSpan MaxAutoRenewDuration => TimeSpan.FromMinutes(5);
public static TimeSpan MessageWaitTimeout => TimeSpan.FromSeconds(10);
public static TimeSpan ShutdownTimeout { get; set; } = TimeSpan.FromMilliseconds(100);
public static int MaxConcurrentCalls => Math.Max(Environment.ProcessorCount, 8);
public static int PrefetchCount => Math.Max(MaxConcurrentCalls, 32);
}
}
As for optimizing the PrefetchCount/MaxConcurrentCalls settings, they should typically be matched for competing consumer with multiple instances to distribute the load.

How to run an async task daily in a Kestrel process?

How do I run an async task in a Kestrel process with a very long time interval (say daily or perhaps even longer)? The task needs to run in the memory space of the web server process to update some global variables that slowly go out of date.
Bad answers:
Trying to use an OS scheduler is a poor plan.
Calling await from a controller is not acceptable. The task is slow.
The delay is too long for Task.Delay() (about 16 hours or so and Task.Delay will throw).
HangFire, etc. make no sense here. It's an in-memory job that doesn't care about anything in the database. Also, we can't call the database without a user context (from a logged-in user hitting some controller) anyway.
System.Threading.Timer. It's reentrant.
Bonus:
The task is idempotent. Old runs are completely irrelevant.
It doesn't matter if a particular page render misses the change; the next one will get it soon enough.
As this is a Kestrel server we're not really worried about stopping the background task. It'll stop when the server process goes down anyway.
The task should run once immediately on startup. This should make coordination easier.
Some people are missing this. The method is async. If it wasn't async the problem wouldn't be difficult.
I am going to add an answer to this, because this is the only logical way to accomplish such a thing in ASP.NET Core: an IHostedService implementation.
This is a non-reentrant timer background service that implements IHostedService.
public sealed class MyTimedBackgroundService : IHostedService
{
private const int TimerInterval = 5000; // milliseconds; change this to 24*60*60*1000 to fire off every 24 hours
private Timer _t;
public async Task StartAsync(CancellationToken cancellationToken)
{
// Requirement: "fire" timer method immediately.
await OnTimerFiredAsync();
// set up a timer to be non-reentrant, fire in 5 seconds
_t = new Timer(async _ => await OnTimerFiredAsync(),
null, TimerInterval, Timeout.Infinite);
}
public Task StopAsync(CancellationToken cancellationToken)
{
_t?.Dispose();
return Task.CompletedTask;
}
private async Task OnTimerFiredAsync()
{
try
{
// do your work here
Debug.WriteLine($"{TimerInterval / 1000} second tick. Simulating heavy I/O bound work");
await Task.Delay(2000);
}
finally
{
// set timer to fire off again
_t?.Change(TimerInterval, Timeout.Infinite);
}
}
}
So, I know we discussed this in the comments, but a System.Threading.Timer callback method is considered an event handler. It is perfectly acceptable to use async void in this case, since an exception escaping the method will be raised on a thread pool thread, just the same as if the method were synchronous. You should probably put a catch in there anyway to log any exceptions.
You brought up timers not being safe at some interval boundary. I looked high and low for that information and could not find it. I have used timers on 24 hour intervals, 2 day intervals, 2 week intervals... I have never had them fail. I have a lot of them running in ASP.NET Core in production servers for years, too. We would have seen it happen by now.
OK, so you still don't trust System.Threading.Timer...
Let's say that, no... There is just no fricken way you are going to use a timer. OK, that's fine... Let's go another route. Let's move from IHostedService to BackgroundService (which is an implementation of IHostedService) and simply count down.
This will alleviate any fears of the timer boundary, and you don't have to worry about async void event handlers. It is also non-reentrant for free.
public sealed class MyTimedBackgroundService : BackgroundService
{
private const long TimerIntervalSeconds = 5; // change this to 24*60*60 to fire off every 24 hours
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
// Requirement: "fire" timer method immediately.
await OnTimerFiredAsync(stoppingToken);
var countdown = TimerIntervalSeconds;
while (!stoppingToken.IsCancellationRequested)
{
if (countdown-- <= 0)
{
try
{
await OnTimerFiredAsync(stoppingToken);
}
catch(Exception ex)
{
// TODO: log exception
}
finally
{
countdown = TimerIntervalSeconds;
}
}
await Task.Delay(1000, stoppingToken);
}
}
private async Task OnTimerFiredAsync(CancellationToken stoppingToken)
{
// do your work here
Debug.WriteLine($"{TimerIntervalSeconds} second tick. Simulating heavy I/O bound work");
await Task.Delay(2000, stoppingToken); // honor cancellation while simulating work
}
}
A bonus side effect is that you can use a long as your interval, allowing more than 25 days between runs, whereas the int-based Timer overloads are capped at about 24.8 days (int.MaxValue milliseconds).
You would register either of these like so:
services.AddHostedService<MyTimedBackgroundService>();

Set default timeout in all asynchronous requests

I'm using @Suspended AsyncResponse response in my requests and starting threads to process the request. When the processing finishes, I try to resume the response, but RestEasy has already marked the request as done because the request thread has finished and no timeout was set on the response. If I set a timeout, it works fine, but I would need to set the timeout in every asynchronous request I want to implement. Is there any way to set the timeout globally for all my suspended AsyncResponse instances?
Unfortunately, the JAX-RS 2.0 specification, the RESTEasy documentation and the Jersey documentation don't mention anything about setting a default timeout for the AsyncResponse.
The Jersey documentation mentions the following:
By default, there is no timeout defined on the suspended AsyncResponse instance. A custom timeout and timeout event handler may be defined using setTimeoutHandler(TimeoutHandler) and setTimeout(long, TimeUnit) methods. The setTimeoutHandler(TimeoutHandler) method defines the handler that will be invoked when timeout is reached. The handler resumes the response with the response code 503 (from Response.Status.SERVICE_UNAVAILABLE). A timeout interval can be also defined without specifying a custom timeout handler (using just the setTimeout(long, TimeUnit) method).
So, the solution won't be different from the solution you are already using:
@GET
public void longRunningOperation(@Suspended final AsyncResponse asyncResponse) {
// Register a timeout handler
asyncResponse.setTimeoutHandler(new TimeoutHandler() {
@Override
public void handleTimeout(AsyncResponse asyncResponse) {
asyncResponse.resume(Response.status(SERVICE_UNAVAILABLE)
.entity("Operation timed out. Please try again.").build());
}
});
// Set timeout
asyncResponse.setTimeout(15, SECONDS);
// Execute long running operation in new thread
executor.execute(new Runnable() {
@Override
public void run() {
executeLongRunningOp();
asyncResponse.resume("Hello async world!");
}
});
}
