If application server crashes and huge log is pending to be persisted into the log file, how log4j Asynchronous Logger handle to avoid missing any log? Or is there any mechanism to retrieve the log?
In graceful shutdown no log is lost, whereas, in crash or killing process, may lose the log and I could not find any approach to return the lost logs.
Related
I'm using gRPC for internal communication b/w 2 java services.
I configured gRPC retry using service config . I am able to get retry count in server using "grpc-previous-rpc-attempts" metadata header. However , I don't find any logs those are getting printed in the client app while retries are happening .
why gRPC is not logging retry attempts which ideally should have
been done when retries are configured
Is there any way to log each
retry attempt in the client app? This is needed for better
observability.
Thanks
At this moment there is no logging support for retries, but it would be a reasonable thing to add.
You can either file a feature request to get that added, or better yet make a pull request for the change. If you decide to make the change, it should be localized to just RetriableStream.java. Feel free to tag me (#temawi) on it and I'll review it for you.
I have a Single Page Azure Web App that uses Signalr(Microsoft.AspNetCore.SignalR" Version="1.0.0-alpha1-final) to broadcast events(login, logout, department creation etc) to connected clients.
I also scale my application to several instances at peak times and i use Redis Cache Backplane (Microsoft.AspNetCore.SignalR.Redis" Version="1.0.0-alpha2-final) to distribute event broadcast messages to all connected clients
i use angular front end (#aspnet/signalr-client": "^1.0.0-alpha2-final)
on azure, i enabled diagnostic log to log information and error messages.
the above works fine but when i scale up the application, it is difficult to trace information or error messages as i have to look through up to 10 instance application logs to find information or error;
my question: How do i ensure redis cache logs error messages or information on all available instances rather than on instances where client is connected; how do i know if a client has missed out on event broadcast message? how do i ensure signalr sever/hub logs all messages on all application log instances?
thank you in advance
The best way to do this might be to use the redis-cli, run the MONITOR command and then pipe that to a file (or somewhere that can store the logs).
I get Timer_ConnectionIdle message from error logs of httperror folder in system32/logfiles.
And sometimes the web page return service unavailable or connection refused.
What is the problem?
How can I solve that?
Two Different issues that you are talking about.
Timer connection idle is not something you need to be worried about. It is HTTP.SYS's way of telling you that the client with which it established a connection did not disconnect because there is always a chance that the client would want to establish the connection again. I think it usually waits for 2 minutes before terminating the connection and that is when you get this message in the HTTPERR logs.
Now coming to Service Unavailable and Connection Timeout errors, this is something that you need to take note of. Check for event logs during the time of issue and see if you find anything there.
If you are unable to find anything in the event logs, my next question would be to identify what is done in order to overcome the issue ? Do you recycle the application pool to get the application up and running ? Do you reset IIS ? If you do any of the above, then please capture a full user dump of w3wp process using debug diag during the time of issue (before performing an iisreset or application pool recycle). Analyzing the dump will tell you exactly whats going wrong.
Feel free to follow up with any questions you have.
My understanding of the (JavaScript) hub client is that if a connection is lost, it enters a 'Reconnecting...' phase which attempts to reconnect. If it can't do so, it will enter a 'Disconnected' state which is where it'll stay until asked to start again.
How long is the 'Reconnecting...' phase meant to last before it gives up? I've read 40 seconds before, but my client seems to take much less time - about 10, maybe less. [EDIT: Nevermind this part, I had configured a 10 disconnect on the server as a test... and forgot. I understand this is set by the server during the negotiate. Makes sense!] ... I'd prefer to have the client continually retry until it is told to abort - can this be done, and would it cause issues?
Another question; during the Reconnecting... phase, if I attempt to call a hub method (again, in JS) it never seems to complete. I'm using the returned Deferred to check for 'done' and 'fail' events, but neither seems to get called. Is this by design?
Thanks.
You can definitely have it continually reconnect.
Handle the disconnected event on the client and call connection.start:
$.connection.hub.disconnected(function() {
setTimeout(function() {
$.connection.hub.start();
}, 5000); // Re-start connection after 5 seconds
});
The only issues this would cause is that you could potentially be triggering infinite requests to a server that isn't there for client machines. This becomes even more troublesome when you introduce the mobile market into the situation (drains battery like crazy).
When you attempt to call a hub method while reconnecting SignalR will try to send your command. Since there are 2 channels, one for receiving data and one for sending, (for all transports except web sockets) in some cases it can still be possible to send requests while your offline. Therefore SignalR does not know if a request fails until the browser tells it that it could not successfully make the request.
Hope this helps!
I might have a clue... Touching the Web.config produces an appPool Recycle, meaning that a new worker process will be created for new requests while the existing process will continue for a while until the remaining requests end or the timeout is reached. Request that do not end in the timeout period are terminated.
Signalr client reconnects to the new process while the long running task is running in the old process, so when on the long running task you do
GlobalHost.ConnectionManager.GetHubContext<ForceHub>();
you actually get a reference for "old" hub while the client is connected to the "new" hub.
That's why the test preformed by Wasp worked: he was making a new request to publish on the signalr hub that was processed in the newly created worker process.
You could try to configure a singalr backplane (https://www.asp.net/signalr/overview/performance/scaleout-in-signalr), it’s really easy to configure it using Sql Server (https://www.asp.net/signalr/overview/performance/scaleout-with-sql-server). The backplane should be capable of connect the two worker processes and hopefully you will get the notification on the client.
If this is the problem, notifications generated by new requests will work even without the backplane. Notice that the real purpose of the backplane is to scale out signalr, this is, to connect a farm of WebServers between them.
Also keep in mind that running long-running task inside IIS is as task hard to achieve as, among other things, IIS does regular appPool recycles and has timeout limits for the requests to execute. I recommend that you read the following post: http://www.hanselman.com/blog/HowToRunBackgroundTasksInASPNET.aspx
“If you think you can just write a background task yourself, it's likely you'll get it wrong. I'm not impugning your skills, I'm just saying it's subtle. Plus, why should you have to?”
Hope this helps
How can I resume a dehydrated orchestration ?
the orchestration in question should have been retrieving messages from a MSMQ queue
but the userid permission wasn't set on the queue, so the BizTalk box wasn't able to read from the queue
Corrected the permissions, but the only options are teminate and suspend ?
If the orchestration attempted to start and failed on the MSMQ receive, it's essentially hung and has not removed a message from the queue. I'd terminate it. The orchestration should clear and pickup the new messages. Does your orchestration implement a singleton pattern or are you using ordered delivery on the receive? This makes things a little more complicated.
Shouldn't you be restarting the biztalk service instance for MSMQ?
Dehydrated means the orchestration is still waiting for something. I guess in your case, you must be waiting for a corelated message from MQ. If you restart receive host service instance, it will try to reconnect all connections (MSMQ, SQL, etc that managed by the service instance). Then all messages will be flow through to orchestrations.
update 1:
Check the relevant receive location. Maybe it got disabled by biztalk due to the permission problem. You will have to enable it manually.
update 0:
Your don't have to resume dehydrated orchestration. It's not the orchestration that read from the queue, but the msmq adapter. When a msmq message arrive the receive location will route it into the message box. If the said orchestration have a subscription ( receive port ) that match the msmq message then it will be resumed by the biztalk engine.
Can you suspend, then resume?
It's been a couple years since I did BizTalk. Quirks like this were annoying. Even worse when it's 250k dehydrated and you need to script to restart them. ugh
I feel for you.
BizTalk's ability to resume depends on the place and way it failed, and whether it can replay any part of the operatio; in most cases, when failing in an orchestration, some coding pattern need to be used to allow it to resume.