Azure Cosmos DB Change Feed Processor options - azure-cosmosdb

The Change Feed Processor options are well described here -
I have a few questions on them:
leaseRenewInterval: Suppose an instance could not renew its lease within 17s (the default lease renew interval). Will the lease be removed from that instance, or will the feed wait until leaseExpirationInterval to remove the lease from it, giving it a chance to reacquire the lease within 60s?
Does lease renewal by default happen after a checkpoint, or are the two independent? i.e. can lease renewal happen on a separate thread after leaseRenewInterval while another thread is still working on a batch?
We have seen the error: failed to checkpoint for owner 'null' with continuation token. How can this happen? Why would the owner become null?
We have also seen the exception LeaseLostException. Can this happen even if the pod/instance is not down? We are not expecting any load balancing since there is only one physical partition, but we want our system to be fault tolerant, so we run multiple instances where all but one are always waiting to acquire the lease.
There are a few instances where we can see, at the same time, 3 pods/instances holding the lease on the same physical partition, i.e. they acquired the same lease. (We can have at most one physical partition: the TTL for documents is 3 days and storage is small, so we are not expecting more than one physical partition.) How can this happen?
EDITS:
Current Settings:
leaseRenewInterval : 17s
leaseAcquireInterval: 13s
leaseExpirationInterval: 60s
feedPollDelay: 2s [only this is not the default]
Change Feed Processor version:
We are using the following in our Maven POM:
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-cosmos</artifactId>
<version>4.8.0</version>
</dependency>
So I assume the CFP version is 4.8.0.

Leases that are not renewed are not removed by the current instance. Other instances can "think" that the lease was not renewed because the current owner crashed, so they will "steal" it. This normally happens when the lease is not accessed/updated before the expiration time.
Independent. There could be no checkpoints (no new changes) and the lease would still get renewed.
That sounds like the lease was released and then a checkpoint was attempted. I'm not sure which CFP version you are using or what your interval configuration is.
Are you customizing any of the intervals? If so, that could lead to a lease being lost (detected as expired by another instance).
Same question as before: this could happen either during load balancing or because leases are being detected as expired.
Please share which CFP version you are using and what the options are. Normally, unless you are very certain about what you are doing, I don't recommend changing any of the intervals.
EDIT: Based on the new information: I am not familiar with the Java CFP, but when the number of instances is higher than the number of leases, load balancing a lease across other instances, while not ideal, shouldn't be a problem, because the lease will still be owned and processed by one machine. The only recommendation I'd make is to use the latest Maven package version. There are CFP fixes in newer versions (https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-sdk-java-v4#4140-2021-04-06), so try 4.15.0.
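For reference, this is a minimal sketch of how those intervals are typically wired up with the Java SDK's ChangeFeedProcessorBuilder, assuming azure-cosmos 4.x; the endpoint, key, database, and container names are placeholders, and the change handler is only an illustration:

import com.azure.cosmos.ChangeFeedProcessor;
import com.azure.cosmos.ChangeFeedProcessorBuilder;
import com.azure.cosmos.CosmosAsyncClient;
import com.azure.cosmos.CosmosAsyncContainer;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.models.ChangeFeedProcessorOptions;
import java.time.Duration;

public class ChangeFeedSetup {
    public static void main(String[] args) {
        CosmosAsyncClient client = new CosmosClientBuilder()
                .endpoint("<account-endpoint>")   // placeholder
                .key("<account-key>")             // placeholder
                .buildAsyncClient();

        // Placeholder database/container names.
        CosmosAsyncContainer feedContainer = client.getDatabase("db").getContainer("monitored");
        CosmosAsyncContainer leaseContainer = client.getDatabase("db").getContainer("leases");

        // The intervals from the question; the defaults are usually the safer choice.
        ChangeFeedProcessorOptions options = new ChangeFeedProcessorOptions();
        options.setLeaseRenewInterval(Duration.ofSeconds(17));
        options.setLeaseAcquireInterval(Duration.ofSeconds(13));
        options.setLeaseExpirationInterval(Duration.ofSeconds(60));
        options.setFeedPollDelay(Duration.ofSeconds(2));

        ChangeFeedProcessor processor = new ChangeFeedProcessorBuilder()
                .hostName("instance-1")           // must be unique per pod/instance
                .feedContainer(feedContainer)
                .leaseContainer(leaseContainer)
                .options(options)
                .handleChanges(docs -> docs.forEach(doc -> System.out.println(doc)))
                .buildChangeFeedProcessor();

        // Lease renewal runs on its own schedule inside the processor,
        // independently of the checkpointing done after each processed batch.
        processor.start().subscribe();
    }
}

The renewal/expiration behaviour described above happens inside the processor; your handler only ever sees batches of changes.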

Related

Reducing replication on Openstack Swift

Is it possible to limit replication on OpenStack Swift to one, or possibly two, replicas? Three data replicas are currently being created.
The default replication in Swift is 3 replicas, but you can change that to any number greater than or equal to 1. The swift-ring-builder command has a set_replicas verb for adjusting the number of replicas; see
https://docs.openstack.org/swift/xena/admin/objectstorage-ringbuilder.html#replica-counts
Of course, there is a tradeoff between the number of replicas and your cluster's ability to cope with loss of disk drives and servers.
You also asked this:
If I don't want replication on OpenStack Swift (replication equal to 1), can I achieve this by disabling the replicators (sudo swift-init object-replicator stop)? If I may ask, is this a better option?
I guess it would probably work ... until someone or something restarts the replicators.
But it is not a good idea:
Monitoring may complain that the replicators are not working(!)
Other swift tools may complain that the cluster is out of spec; e.g. swift-recon --replication.
If you have been previously running with (say) a 3 replica policy, then nothing will delete the (now) redundant replicas created before you turned off the replicators.

When will the Cosmos DB client's mapping cache be refreshed?

We are using Cosmos DB SDK version 2.9.2 and perform document CRUD operations. Usually, the end-to-end P95 latency is 20ms, but sometimes the latency is over 1000ms. The high latency period lasts from 10 hours to a day. The collection is not throttling.
We have gotten some background information from:
https://icm.ad.msft.net/imp/v3/incidents/details/171243015/home
https://icm.ad.msft.net/imp/v3/incidents/details/168242283/home
There are some diagnostics strings in the tickets.
We know that the client maintains a cache of the mapping from logical partition to physical replica address. This mapping may be outdated because of replica movement or an outage, so the client tries to read from the second/third replica. However, this retry has a significant impact on end-to-end latency. We also observe that the high latency/timeouts can last for several hours, even days. I expect there is some mechanism for refreshing the mapping cache in the client, but it seems the client stops visiting more than one replica only after we redeploy our service.
Here are my questions:
How can the client tell whether it is unable to connect to a certain replica? Will the client wait until timeout, or will the server tell the client that the replica is unavailable?
Under which conditions will the mapping cache be refreshed? We are using Session consistency and TCP mode.
Will restarting our service force the cache to be refreshed? Or does refreshing only happen when the machine restarts?
When we find there is a replica outage, is there any way to mitigate it quickly?
What operations are performed (Document CRUD or query)?
And what are the observed latencies and frequencies? Please also check whether the collection is throttling (with a custom throttling policy).
The client does manage some metadata and handles its staleness efficiently, within SLA bounds.
Can you please create a support ticket with the account details and the request diagnostics ('RequestDiagnostics'), and we will look into it.
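For illustration, here is a rough sketch of capturing per-request diagnostics with the newer Java SDK (azure-cosmos 4.x), which exposes CosmosDiagnostics on responses; the 2.9.2 SDK in the question surfaces a diagnostics string on its responses instead, so treat this as an assumption rather than your exact API. The database, container, item id, and POJO are placeholders:

import com.azure.cosmos.ConsistencyLevel;
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.CosmosDiagnostics;
import com.azure.cosmos.models.CosmosItemResponse;
import com.azure.cosmos.models.PartitionKey;
import java.time.Duration;

public class DiagnosticsSample {
    public static void main(String[] args) {
        CosmosClient client = new CosmosClientBuilder()
                .endpoint("<account-endpoint>")          // placeholder
                .key("<account-key>")                    // placeholder
                .consistencyLevel(ConsistencyLevel.SESSION)
                .directMode()                            // TCP/direct mode, as in the question
                .buildClient();

        CosmosContainer container = client.getDatabase("db").getContainer("items"); // placeholders

        CosmosItemResponse<MyItem> response =
                container.readItem("item-id", new PartitionKey("pk-value"), MyItem.class);

        CosmosDiagnostics diagnostics = response.getDiagnostics();
        // Log the full diagnostics only for slow calls, so you have the replica/address
        // details on hand to attach to the support ticket.
        if (diagnostics.getDuration().compareTo(Duration.ofMillis(100)) > 0) {
            System.out.println(diagnostics.toString());
        }
    }

    // Hypothetical POJO standing in for your document type.
    public static class MyItem {
        public String id;
    }
}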

Corda Node getting JVM OutOfMemory Exception when trying to load data

Background:
We are trying to load data into our custom CorDapp (Corda 3.1) using JMeter.
Our CorDapp is distributed across six nodes (three parties, two notaries and one oracle).
The flow being executed to load data has very minimal business logic, has three participants, and requires two parties to sign the transaction.
Below is the environment, configuration and test details:
Server: Ubuntu 16.04
Ram: 8GB
Memory allocation to Corda.jar: 4GB
Memory allocation to Corda-webserver.jar : 1GB
JMeter Configuration- Threads(Users): 20 (1 transaction per second per thread)
Result:
Node B crashed after approximately 21,000 successful transactions (in roughly 3 hours and 30 minutes) with "java.lang.OutOfMemoryError: Java heap space". After some time, the other nodes crashed due to continuous "handshake errors" with Node B.
We analyzed the heap dump using Eclipse MAT and found that more than 21,000 instances of Hibernate's SessionFactoryImpl had been created, occupying more than 85% of the memory on Node B.
We need to understand why the Corda network is creating so many objects and keeping them in memory.
We are continuing our investigation, as we are not 100% sure whether this is entirely a Corda bug.
A solution to the problem is critical for us to continue further tests.
Note - we have more details about our investigation but are unable to attach them here; we can send them over email.
If you're developing in Java, it is likely that the issue you're encountering has already been fixed by https://r3-cev.atlassian.net/browse/CORDA-1411
The fix is not available in Corda 3.1 yet, but the JIRA ticket provides a workaround. You need to override equals and hashCode on any subclasses of MappedSchema that you've defined. This should fix the issue you're observing.
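For illustration, a minimal sketch of that workaround in Java; ExampleSchemaV1 is a hypothetical schema name, and the mapped types list is left empty where your actual PersistentState entity classes would go:

import java.util.Arrays;
import java.util.Objects;
import net.corda.core.schemas.MappedSchema;

// Hypothetical schema subclass illustrating the equals/hashCode workaround from CORDA-1411.
public class ExampleSchemaV1 extends MappedSchema {

    public ExampleSchemaV1() {
        // Replace the empty list with your actual JPA entity classes.
        super(ExampleSchemaV1.class, 1, Arrays.<Class<?>>asList());
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        // Same concrete schema class and same version => same schema.
        return getVersion() == ((MappedSchema) o).getVersion();
    }

    @Override
    public int hashCode() {
        return Objects.hash(getClass(), getVersion());
    }
}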

Very slow Riak writes and this error: {shutdown,max_concurrency}

On a 5-node Riak cluster, we have observed very slow writes - about 2 docs per second. Upon investigation, I noticed that some of the nodes were low on disk space. After making more space available and restarting the nodes, we are seeing this error (or something similar) on all of the nodes in console.log:
2015-02-20 16:16:29.694 [info] <0.161.0>#riak_core_handoff_manager:handle_info:282 An outbound handoff of partition riak_kv_vnode 182687704666362864775460604089535377456991567872 was terminated for reason: {shutdown,max_concurrency}
Currently, the cluster is not being written to or read from.
I would appreciate any help in getting the cluster back to good health.
I will add that we are writing documents to an index that is tied to a Solr index.
This is not critical production data, and I could technically wipe everything and start fresh, but I would like to properly diagnose and fix the issue so that I am prepared to handle it if it happens in a production environment in the future.
Thanks!

Forcing a BizTalk Host to Throttle for Debugging Purposes

We're currently having an issue on our production servers and would like to try to replicate it in dev. I'm currently awaiting access to our performance monitoring tool, and while waiting I would like to play with this a little.
Since I suspect host throttling in prod, I'm thinking of forcing hosts to throttle in dev and seeing if that recreates the issue.
Is there a way to do this?
As others have mentioned, monitoring the throttling counters and other counters like memory and WIP messages is a must to see what is going on in your production server. I would also recommend setting up a SCOM alert on throttling states of 3+ (publishing + delivery states), if you have SCOM.
Message throughput can grind to a halt, especially in the memory (4, 5) and Queue Size (6) states. States 1+2 are generally short-lived (e.g. the arrival of a large batch of messages) and BizTalk recovers within a few seconds.
Simulating the memory state in your dev environment should be straightforward by tweaking the throttling thresholds (obviously not something to be taken lightly in production!).
e.g. to trigger the memory threshold states - AFAIK the lowest memory usage threshold you can set is 101MB. Running a load test in dev should then be able to reproduce the throttling.
There is also apparently a user-based throttling override to set states 10 and 11, although I haven't actually tried this.
Some other experience on avoiding throttling:
(Caveat - I don't have an active BizTalk 2006/R2 setup - this is for 2009 / 2010)
If you do a lot of asynchronous processing (e.g. queue receives), ensure that you have split functionality into separate hosts for receive, processing and send. This way you can adjust the throttling for the async receive hosts to trigger much earlier than the processing and send hosts - this should have the effect of constricting new incoming messages to the MessageBox while allowing existing messages to complete processing.
On 64-bit hosts, the default 25% host memory usage throttling level is usually an unnecessary liability - we increased this to 50%, per Yossi Dahan's recommendation, on a 4GB server.
Note that suspended messages count toward throttling state 6 - ensure that you have a strategy for dealing with suspended messages (and obviously ensure that the SQL Agent jobs are running!).
