Java Nats Client: Incomprehensible error message "Subject remapping requires Options.oldRequestStyle() ..." - nats.io

Our project uses NATS (server version 2.1.9) as its message server.
The Java NATS client often logs errors like:
`ERROR: Subject remapping requires Options.oldRequestStyle() to be set on the Connection`
It comes from io.nats.jnats version 2.8 (nats.client.impl.NatsConnection # deliverReply).
However, we have not noticed any problem in our system.
So my question is: How can it happen, what does it warn us about, and how can we handle it?

Please see the response to our GitHub issue that addresses this: https://github.com/nats-io/nats.java/issues/424
In short, this is fixed in our 2.9.0 version. I answered in the issue:
We are able to reproduce this under those conditions of re-using messages. I had also seen it intermittently but couldn't quite put my finger on it.
As it turns out, the error message is misleading anyway. It happens when we can't match the incoming message with the outgoing one / future that made the request. This should actually never happen. Since it was out-of-band (async), there really wasn't much we could do, so printing a message was better than nothing.
The good news is that the current 2.9.0-SNAPSHOT has a fix regarding message handling that addresses this problem. Can you try upgrading to 2.9.0-SNAPSHOT and see if this fixes your problem? 2.9.0-SNAPSHOT is stable and backward compatible since it consists almost exclusively of JetStream additions and improvements.
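To make the "can't match the incoming message with the future that made the request" case more concrete, here is a minimal sketch of request-reply correlation. It is not the jnats implementation: `ReplyCorrelator`, `register`, and `deliverReply` are hypothetical names, and the reply token is just a string. Each request registers a future under its token, and the asynchronous reader completes that future when a reply arrives. If no pending entry matches, all the reader can do is log.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the jnats implementation.
class ReplyCorrelator {
    // Pending requests, keyed by the reply token sent along with each request.
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Called by the requester before the request goes out.
    public CompletableFuture<String> register(String replyToken) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(replyToken, future);
        return future;
    }

    // Called asynchronously when a reply arrives.
    // Returns false when no pending request matches the token.
    public boolean deliverReply(String replyToken, String payload) {
        CompletableFuture<String> future = pending.remove(replyToken);
        if (future == null) {
            // Out-of-band: there is no caller to report to, so logging is all that is left.
            System.err.println("No pending request for reply token " + replyToken);
            return false;
        }
        future.complete(payload);
        return true;
    }

    public static void main(String[] args) {
        ReplyCorrelator correlator = new ReplyCorrelator();
        CompletableFuture<String> reply = correlator.register("inbox.1");
        correlator.deliverReply("inbox.1", "pong");      // matched: completes the future
        System.out.println("reply = " + reply.join());
        correlator.deliverReply("inbox.1", "late copy"); // unmatched: logged, returns false
    }
}
```

In this sketch a second reply for the same token is unmatched because the entry was already removed, which is one way such a log line could appear without any visible effect on the application.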

Related

Deliberately sending the wrong status

The front-end developer has asked us to make several APIs return a success status (HTTP 200) even when the system has a problem. In your opinion, could deliberately reporting the wrong status like this cause problems in the future?

What is the recommended action on receiving ChangeFeedObserverCloseReason.ObserverError?

My ChangeFeedProcessor's IChangeFeedObserver.CloseAsync callback was invoked with ChangeFeedObserverCloseReason as "ObserverError". So far, I have seen this error only once and I am not sure how to reproduce it. What causes this error? Is there a way to diagnose it further? Is there any recommended action that one should take after receiving this error?
From your question, I understand you are using Change Feed Processor.
That Close reason is provided when the code in your ProcessChangesAsync implementation throws an unhandled exception.
Basically, if that happens, it means your code had an error processing the changes so:
The Observer is closed, releasing the Lease
The lease becomes available to be picked by any Host instance
The lease gets picked up by a Host, an Observer is started, and the same batch of changes is sent to be processed again
If the nature of your error was transient, then this time it will work (hopefully). If it's not transient, then you will again face an ObserverError.
As a rule of thumb, always try to handle your exceptions where possible; if you don't, the error will be treated as a temporary scenario and the batch will eventually be retried as I described.
Also, next time please give more context: describe which libraries and versions you are using and provide some related code. It helps a lot in understanding and diagnosing the problem.
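As a rough illustration of the "manage your exceptions" advice, here is a minimal Java sketch of a batch handler that never lets an exception escape. None of these types belong to the Azure Cosmos DB SDK: `SafeBatchHandler`, `processBatch`, and `failedItems` are hypothetical names, and changes are modeled as plain strings. A failing item is recorded and skipped, so the batch as a whole completes instead of being redelivered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; these types are not part of the Azure Cosmos DB SDK.
class SafeBatchHandler {
    private final Consumer<String> itemProcessor;
    private final List<String> failedItems = new ArrayList<>();

    public SafeBatchHandler(Consumer<String> itemProcessor) {
        this.itemProcessor = itemProcessor;
    }

    // Processes every item in the batch and returns how many succeeded.
    // A failure in one item is recorded instead of thrown, so no exception
    // reaches the caller and the whole batch is not retried.
    public int processBatch(List<String> batch) {
        int succeeded = 0;
        for (String item : batch) {
            try {
                itemProcessor.accept(item);
                succeeded++;
            } catch (RuntimeException e) {
                failedItems.add(item);
                System.err.println("Failed to process " + item + ": " + e.getMessage());
            }
        }
        return succeeded;
    }

    // Items that failed, kept for later inspection or reprocessing.
    public List<String> failedItems() {
        return new ArrayList<>(failedItems);
    }

    public static void main(String[] args) {
        SafeBatchHandler handler = new SafeBatchHandler(item -> {
            if (item.isEmpty()) {
                throw new IllegalArgumentException("empty change");
            }
        });
        int ok = handler.processBatch(List.of("doc-1", "", "doc-2"));
        System.out.println(ok + " succeeded, failed: " + handler.failedItems());
    }
}
```

Whether setting failed items aside is appropriate depends on the workload. For errors that are likely transient, letting the exception propagate and relying on the retry described above may be the better choice.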

Is a more specific HTTP code a "SemVer breaking change"?

Let's say the version of my SemVer API is 2.0.0.
Suppose I have an endpoint /foo that right now erroneously always returns either {200, "some message"} or {500, "some message"}
I fix a bug so that I now detect bad requests, and now I return {200, "some message"} or {400, "some message"} or {500, "some message"}
Is this considered an API Breaking change in SemVer? The user may not be expecting the 400, so I can see the case for 3.0.0, however I can also see that this should have been a 400 on a BAD REQUEST all along, because in some sense "HTTP" is my API, hence a patch fix to 2.0.1, so I am torn.
I think that before answering this, you must determine philosophically where responsibility lies. Namely, if client code is not capable of handling a new (correct) status code, is that break fairly the result of your change or the result of poorly written client code? If you had previously been sending some useful information along with your 500 responses, then it would be reasonable to expect that client code may have relied upon it.
To me, it sounds like the very definition of a minor change:
MINOR version [is] when you add functionality in a backwards-compatible manner[. . .]
Actual internal server errors are still sent properly and successful results are still sent properly; what you've done is simply added the ability for clients to determine from where errors arise.
edit: So instead of 2.0.1 or 3.0.0, I believe this would be 2.1.0.
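On the client side, the question of who is responsible gets easier if the client branches on the status class instead of on an exact list of codes, since HTTP asks clients to treat an unrecognized code like the x00 code of its class. Here is a small sketch of that idea. `StatusClassifier` and `Outcome` are hypothetical names, not part of any library.

```java
// Hypothetical client-side helper; not part of any library.
class StatusClassifier {
    enum Outcome { SUCCESS, REDIRECT, CLIENT_ERROR, SERVER_ERROR, OTHER }

    // Classifies a status code by its first digit, so a code the client
    // has never seen (for example a newly introduced 400) still lands in
    // a branch the client already handles.
    static Outcome classify(int status) {
        switch (status / 100) {
            case 2: return Outcome.SUCCESS;
            case 3: return Outcome.REDIRECT;
            case 4: return Outcome.CLIENT_ERROR;
            case 5: return Outcome.SERVER_ERROR;
            default: return Outcome.OTHER; // 1xx and anything out of range
        }
    }

    public static void main(String[] args) {
        for (int status : new int[] {200, 400, 500}) {
            System.out.println(status + " -> " + classify(status));
        }
    }
}
```

A client written this way keeps working when /foo starts returning 400, which supports reading the change as backwards compatible.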

Is there any reason a website should return appropriate HTTP status codes?

I'm working on a small web application, and I'm trying to decide if I should make the effort to emit semantically appropriate HTTP status codes from within the application.
I mean, it makes sense for the web server itself to emit proper response codes. 500 Internal Server Error for a misconfigured Apache or 404 Not Found for a missing index.php or whatever all make sense, since there's nothing else the server can really do.
It also makes sense to manipulate the browser with 303 See Other or other HTTP mechanisms which actually produce behavior.
But if all that happened is a missing GET parameter, for example, is there any reason to go out of my way to return 400 Bad Request? Or how about 404 Not Found, if my application is handling all the routing by itself? From what I can tell there isn't any behavior associated with either of those error codes.
My general opinion: provide codes if the code provides actionable data for the user.
If all you're doing is presenting content, then in most cases I think it's less important. If YouTube fails to load a video, I mostly care about the fact that I can't watch my video. That it failed with a 418 status might be intellectually interesting, but it doesn't really provide me with any helpful information (Even assuming a non-silly failure code).
On the other hand, if you're allowing some kind of user interaction with a server, then the codes become much more important. I might actually care about why my request failed, because I'm now in a position to do something about it.
However, there are some codes that are actionable. 410 Gone for example: If my request failed for that reason, but I just got back a generic "Stuff Broke" message, I'd probably repeat the request a bunch of times, get nowhere, and give up in frustration. Knowing that the thing I'm looking for doesn't exist is a pretty useful thing for me to know.
I think it's very important for a web service to respond with appropriate codes, as sometimes the developer using the service might not know what's wrong or why the app stopped working unless they view the status code.

Azure SqlException: Database on server is not currently available

Our site has been running for a few weeks in Azure without getting this error:
SqlException: Database 'database' on server 'server' is not currently
available. Please retry the connection later. If the problem
persists, contact customer support, and provide them the session
tracing ID of 'guid'.
We finally got it one day when there were a little over 2K active (concurrent) users. This is the closest question that I could find on SO. We are not using EF, though; we're using Dapper. I'm out of ideas on how to debug our application to find out what caused the issue, and it's even harder now that the issue has not come up for the past 2 days. I definitely need to be on the lookout, and I would appreciate any tips on where I should be looking, what I need to do to determine the cause of the issue, and how to possibly fix it.
It sounds like you need to handle transient failures via some sort of transient fault handling mechanism. Here is a post asking a similar question:
SQL Azure Database retry logic. David's answer is similar to the approach we took to deal with the issue.
Here is another link to some code, similar to David's and our solution, to help you get your head around it: http://www.getcodesamples.com/src/4A7E4E66/41D6FAD
We had similar issues when we first moved to SQL Azure, but after we implemented back-off retry logic for the transient connection issues, it recovers after a few seconds the majority of the time.
We went down the path of handling transient errors with the Azure Transient Fault Block, but this caused bigger issues - namely, if you reach the SQL connection limit (easy to do), having retry logic in place only makes things worse.
If it only happens once a month, I'd leave it be, and just handle it gracefully higher up the stack. An alternative is to create a custom retry policy to avoid retrying on certain errors, but it may still do more harm than good.
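For reference, back-off retry with a policy that refuses to retry certain errors could look roughly like the Java sketch below. It is not the Azure transient fault handling library: `BackoffRetry` and `execute` are hypothetical names, and the `isTransient` predicate is supplied by the caller, because which errors count as transient (and which, like hitting the connection limit, should not be retried) depends on your setup.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch; not the Azure transient fault handling library.
class BackoffRetry {
    // Runs the operation, retrying with exponentially growing delays.
    // Errors that the predicate does not mark as transient are rethrown
    // immediately, as is the last error once maxAttempts is reached.
    static <T> T execute(Supplier<T> operation,
                         Predicate<RuntimeException> isTransient,
                         int maxAttempts,
                         long initialDelayMillis) {
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts || !isTransient.test(e)) {
                    throw e;
                }
                sleep(delay);
                delay *= 2; // back off: wait twice as long before the next attempt
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", ie);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for a database call that fails twice and then succeeds.
        String value = execute(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("database not currently available");
            }
            return "row";
        }, e -> e instanceof IllegalStateException, 5, 100L);
        System.out.println(value + " after " + calls[0] + " attempts");
    }
}
```

Keeping the transient check as a separate predicate makes it easy to exclude errors where retrying only adds load, which is the concern raised above about connection limits.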
