I/O error on POST request for "https://iid.googleapis.com/iid/v1:batchAdd": Connection reset; nested exception is javax.net.ssl.SSLException: Connection reset - push-notification

I am using https://iid.googleapis.com/iid/v1:batchAdd to create a topic in Firebase Cloud Messaging, with up to 1000 device IDs in the payload, for push notifications. The API fails intermittently with:
I/O error - connection reset; nested exception is javax.net.ssl.SSLException: Connection reset.
The failure is very intermittent: most of the time the call works, then it suddenly starts throwing this connection reset error, and after waiting for some time it starts working again.
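For context, the call is a plain HTTPS POST to the Instance ID batch endpoint. Below is a minimal sketch of the request shape, assuming a Spring RestTemplate client with Jackson on the classpath; the server key placeholder, topic name, and tokens are illustrative, not from the original setup:
import java.util.List;
import java.util.Map;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;

public class TopicBatchAdd {

    private static final String BATCH_ADD_URL = "https://iid.googleapis.com/iid/v1:batchAdd";

    public static void main(String[] args) {
        RestTemplate restTemplate = new RestTemplate();

        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);
        // Placeholder credential: the FCM server key, sent as "key=...".
        headers.set("Authorization", "key=SERVER_KEY");

        // Up to 1000 registration tokens are allowed per batchAdd request.
        Map<String, Object> body = Map.of(
                "to", "/topics/my-topic",
                "registration_tokens", List.of("token-1", "token-2"));

        ResponseEntity<String> response = restTemplate.postForEntity(
                BATCH_ADD_URL, new HttpEntity<>(body, headers), String.class);
        System.out.println(response.getStatusCode() + " " + response.getBody());
    }
}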

Related

Flutter Firestore Stream closed with status: Status{code=UNAVAILABLE} automatically, Internet is working perfectly

I'm building an app using Flutter with Firestore as the backend. When I query Firestore, it works fine. But after 15-20 minutes of inactivity (not using the phone or performing any action on the device), the next query (tapping a button to send/retrieve data) shows the following error. When I tap again, the query is then sent, which means I have to press the button twice.
V/NativeCrypto( 4054): Read error: ssl=0xbe39a7c8: I/O error during system call, Connection reset by peer
V/NativeCrypto( 4054): Write error: ssl=0xbe39a7c8: I/O error during system call, Broken pipe
V/NativeCrypto( 4054): SSL shutdown failed: ssl=0xbe39a7c8: I/O error during system call, Success
W/Firestore( 4054): (23.0.3) [WatchStream]: (a4f0ee) Stream closed with status: Status{code=UNAVAILABLE, description=End of stream or IOException, cause=null}.
[cloud_firestore/unavailable] The service is currently unavailable. This is a most likely a transient condition and may be corrected by retrying with a backoff.
The error occurs after 15-20 minutes of inactivity.

Spring Cloud Stream: unexpected shutdown is not covered by DLQ

We are using Spring Cloud Stream 2.2 with the Kafka binder. We have noticed that if the pod is killed in the middle of processing a job, for whatever reason, the message that should go to the DLQ is lost.
We handle exceptions by catching the failure, logging it, sending the failure to another service that tracks such situations, and finally rethrowing the exception so it is caught by the error channel and captured by the DLQ. This approach works seamlessly for normal failures, but if the failure is triggered externally (like an unexpected shutdown), the DLQ part is missed: it seems the process is killed before the message reaches the error channel. I wonder if this is a known issue, as it impacts the at-least-once guarantee of the framework in our use case.
22:34:48.077 INFO Shutting down ExecutorService
22:34:48.135 INFO Consumer stopped
22:34:48.136 INFO stopped org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter#5174b135
22:34:48.155 INFO Registering MessageChannel outbox-usermgmt.event.job-creator-outbox-event-syncs.errors
22:34:48.241 INFO Channel 'application.outbox-usermgmt.event.job-creator-outbox-event-syncs.errors' has 1 subscriber(s).
22:34:48.241 INFO Channel 'application.outbox-usermgmt.event.job-creator-outbox-event-syncs.errors' has 0 subscriber(s).
22:34:48.246 INFO Registering MessageChannel progress-report.errors
22:34:48.258 INFO Channel 'application.progress-report.errors' has 0 subscriber(s).
22:34:48.262 INFO Registering MessageChannel job-created.errors
22:34:48.273 INFO Registering MessageChannel progress-report.errors
22:34:48.350 INFO Channel 'application.job-created.errors' has 0 subscriber(s).
22:34:48.366 INFO Registering MessageChannel job-created.errors
22:34:48.458 INFO Removing {logging-channel-adapter:_org.springframework.integration.errorLogger} as a subscriber to the 'errorChannel' channel
22:34:48.458 INFO Channel 'application.errorChannel' has 1 subscriber(s).
22:34:48.459 INFO stopped _org.springframework.integration.errorLogger
22:34:48.459 INFO Shutting down ExecutorService 'taskScheduler'
22:34:48.467 WARN Destroy method 'close' on bean with name 'genericSpecificFlexibleDeserializer' threw an exception: java.lang.NullPointerException
22:34:48.472 ERROR Job has failed, Fail to retrieve record's full tree, Connection closed unexpectedly
22:34:48.472 ERROR Fail to retrieve record's full tree
22:34:48.472 DEBUG Sending progress update of 0.0 with status of failed
22:34:48.474 ERROR Job has failed, Fail to retrieve record's full tree
22:34:48.538 INFO Closing JPA EntityManagerFactory for persistence unit 'default'
22:34:48.538 INFO Shutting down ExecutorService
22:34:48.541 INFO HikariPool-1 - Shutdown initiated...
22:34:48.543 INFO HikariPool-1 - Shutdown completed.
Code snippet:
try {
    ...
} catch (Exception ex) {
    // capture the failure details in logs
    // send a failure progress update to another service
    // rethrow so the error channel (and the DLQ) captures the failure
    throw new JobProcessingException(ex);
}
It appears the framework commits the offset before ensuring that the DLQ message has been published to Kafka, so the offset moves on even though nothing was published to the DLQ and the message is effectively skipped.
P.S.: This scenario happens for us whenever Kubernetes sends a restart signal to the pod for whatever reason (pod eviction, a new release, etc.). I suppose that if the kill signal were forced, there would be no commit in the first place and the job would simply be restarted.
This is a known problem - see https://github.com/spring-projects/spring-integration/issues/3450
The issue is that a PublishSubscribeChannel allows zero subscribers and no exception is thrown if there are none.
It has been resolved in Spring Integration (5.4.x) but is still a problem in the binder because it creates a pub/sub error channel by default.
See my comment there...
Yes; I think that solution makes sense; it shouldn't cause any real problems because the default errorChannel always gets one subscriber.
However, it won't solve the problem in the binder because the message producer gets a binding-specific error channel (which is bridged to the global error channel), so we'd need a similar change there.
It should be possible to work around it by declaring the binding's error channel as a DirectChannel @Bean, in which case an exception will be thrown if the consumer has unsubscribed (during shutdown). However, this means errors will only go to the binding-specific error channel and won't be bridged to the global errorChannel.
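As a rough sketch of that workaround (the bean name must match the binding's error channel, "<destination>.<group>.errors"; the name below is taken from the log output above and will differ in other setups):
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.channel.DirectChannel;
import org.springframework.messaging.MessageChannel;

@Configuration
public class BindingErrorChannelConfig {

    // A DirectChannel throws a MessageDeliveryException when it has no subscribers
    // (for example during shutdown), unlike the default pub/sub error channel,
    // which silently accepts the message even with zero subscribers.
    @Bean("outbox-usermgmt.event.job-creator-outbox-event-syncs.errors")
    public MessageChannel jobCreatorBindingErrorChannel() {
        return new DirectChannel();
    }
}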
https://github.com/spring-cloud/spring-cloud-stream/issues/2082

Still receiving timeout connection errors using the JS DAX client, classified as an error but with no impact

We use the JS amazon-dax-client (1.2.3) and we receive a bunch of errors like:
ERROR Failed to pull from <cluster>.cache.amazonaws.com (-.-.-.-): TimeoutError: Connection timeout after 10000ms
at SocketTubePool.alloc (/opt/nodejs/node_modules/amazon-dax-client/src/Tube.js:227:64)
at /opt/nodejs/node_modules/amazon-dax-client/generated-src/Operations.js:215:30
My understanding is that this is effectively a warning, not an error? We don't see any adverse effects (at least none noticeable) and the functions can access DAX just fine, but it pollutes our logs.
If the DAX endpoint weren't properly configured, the DB operations would fail, which isn't what we're seeing here.

StorageException on resuming Firebase upload task through WorkManager

I'm running a Firebase upload task through WorkManager. On each progress update from the UploadTask, I save the session URI in my SharedPreferences.
When I switch the internet off, Firebase handles the scenario itself and resumes the upload task when the internet is turned back on.
But when I power off the phone while Firebase is uploading the file and then power it back on, WorkManager restarts and attempts to resume the previous Firebase upload task with the last saved session URI, which gives the following exception:
E/StorageException: StorageException has occurred.
An unknown error occurred, please check the HTTP result code and inner exception for server response.
Code: -13000 HttpResult: 200
E/StorageException: The server has terminated the upload session
java.io.IOException: The server has terminated the upload session
at com.google.firebase.storage.UploadTask.recoverStatus(com.google.firebase:firebase-storage##16.0.5:354)
at com.google.firebase.storage.UploadTask.run(com.google.firebase:firebase-storage##16.0.5:200)
at com.google.firebase.storage.StorageTask.lambda$getRunnable$7(com.google.firebase:firebase-storage##16.0.5:1106)
at com.google.firebase.storage.StorageTask$$Lambda$12.run(Unknown Source:2)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:798)
My session URI is saved whenever I get a progress callback from the Firebase upload task. Maybe the problem is that I didn't get the latest one while the phone was powering off.
I would appreciate it if anyone could help identify the problem here.
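For reference, a minimal sketch of the save/resume flow described above, using the Firebase Storage Android SDK; the preference key and the fallback logic are illustrative assumptions, not code from the original post:
import android.content.SharedPreferences;
import android.net.Uri;
import com.google.firebase.storage.StorageMetadata;
import com.google.firebase.storage.StorageReference;
import com.google.firebase.storage.UploadTask;

public class ResumableUpload {

    private static final String KEY_SESSION_URI = "upload_session_uri"; // illustrative key

    // Start an upload and persist the most recent session URI on every progress callback.
    public static UploadTask start(StorageReference ref, Uri fileUri, SharedPreferences prefs) {
        UploadTask task = ref.putFile(fileUri);
        task.addOnProgressListener(snapshot -> {
            Uri sessionUri = snapshot.getUploadSessionUri();
            if (sessionUri != null) {
                prefs.edit().putString(KEY_SESSION_URI, sessionUri.toString()).apply();
            }
        });
        return task;
    }

    // Resume from the saved session URI after a restart. Resuming with a stale or
    // terminated session URI is what produces the -13000 StorageException above,
    // so a fresh upload may be needed as a fallback.
    public static UploadTask resume(StorageReference ref, Uri fileUri, SharedPreferences prefs) {
        String saved = prefs.getString(KEY_SESSION_URI, null);
        if (saved == null) {
            return ref.putFile(fileUri); // nothing saved, start a fresh upload
        }
        return ref.putFile(fileUri, new StorageMetadata.Builder().build(), Uri.parse(saved));
    }
}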

Firebase Auth error attempting sign-in offline continues even after coming back online

Description
Calling either signInWithRedirect() or signInWithPopup() while offline will throw the expected error:
{code: "auth/network-request-failed", message: "A network error (such as timeout, interrupted connection or unreachable host) has occurred."}.
However, returning online and calling signInWithRedirect() or signInWithPopup() again will throw the same error. Any attempt to call these functions afterwards results in the same error unless the browser is refreshed.
Expected outcome
Auth sign-in functions normally after coming back online
Actual outcome
Auth sign-in throws an error and continues to do so on any following attempts
Steps to reproduce
Go offline
Call either signInWithRedirect() or signInWithPopup() (the error should be logged here: {code: "auth/network-request-failed", message: "A network error (such as timeout, interrupted connection or unreachable host) has occurred."})
Go online
Call either signInWithRedirect() or signInWithPopup() (the same error occurs on every sign-in attempt)
Can anybody provide a solution to this?
firebaser here
We've been able to confirm this behavior with signInWithRedirect(). This is indeed a bug. We'll fix it in an upcoming version.
Update: This should be fixed in version 4.1.3.
This issue appears to occur with Firebase JavaScript SDK 9.5.0 (and potentially with earlier releases as well).
The steps to reproduce are the same as described above. A few additional observations:
If the first call is made to signInWithRedirect and it fails because the network is unavailable, subsequent calls to both signInWithRedirect and signInWithPopup result in the 'auth/network-request-failed' error even after the network connection is restored.
However, if the first call is made to signInWithPopup and it fails because the network is unavailable, subsequent calls to both signInWithRedirect and signInWithPopup will succeed after the network connection is restored.
