Backend doesn't respond to HTTP requests

I know the information below is almost certainly not enough to solve my problem on its own; I can provide more details if and when needed.
The situation is the following. I was programming normally and, at some point, made a PUT request to my application's backend. The browser tab suddenly logged a huge number of errors, whose contents I cannot recall because the tab closed itself soon after. Almost at the same time, both VSCode windows I had open, running the backend and the frontend of the application, closed themselves as well.
Since then, the backend does not appear to answer any HTTP request made to it. Execution never reaches the controller code: I have put a console.log there and nothing is logged. Likewise, no errors are shown in the console when a request is made. The backend still connects to an MQTT broker, as it is supposed to.
A request from Insomnia loads forever unless I cancel it, although Insomnia does notice when the connection is reset, reporting the error
Failure when receiving data from the peer
Lastly, the frontend created a debug.log file in the project's folder containing 21 lines that read
[0624/203732.834:ERROR:crash_report_database_win.cc(428)] unexpected header
differing only in the numbers at the start. It also created a yarn-error.log, a very long log with one line that caught my eye (because it had the word "Error" in it); it reads, among other things,
Trace: Error: EBUSY: resource busy or locked...
I have no clue what happened or what I should do.

Run the following:
npm install express
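Reinstalling dependencies may help if node_modules was corrupted by the crash (the EBUSY error in yarn-error.log hints at a locked file), but it is also worth confirming whether requests reach the Node process at all. Below is a minimal diagnostic sketch, assuming the backend is an Express app (the port and file name are placeholders, not taken from the question): it logs every request before any routing and exposes a trivial health route with no controllers, database, or MQTT involved.

// diagnostic-sketch.ts -- hypothetical stand-alone check, assumes an Express backend
import express from "express";

const app = express();

// Log every request the moment it reaches Express, before any routing or body parsing.
// If nothing is printed here either, the request never reaches the Node process.
app.use((req, _res, next) => {
  console.log(`[${new Date().toISOString()}] ${req.method} ${req.url}`);
  next();
});

// Trivial health route that touches none of the application's own code.
app.get("/health", (_req, res) => {
  res.status(200).send("ok");
});

app.listen(3000, () => {
  console.log("diagnostic server listening on port 3000");
});

If requests show up in this log but the original backend stays silent, the hang is somewhere in that app's own middleware chain; if even this minimal server cannot be reached, look outside Node (a stale process still holding the port, a proxy, or the OS).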

Related

Some requests on IIS hang for minutes and end in a lost connection

I have an awkward issue with IIS 10.0 on Windows Server 2016, ASP.NET 4.5.2, and MVC 5.2.7.
At times, certain requests do not receive a response and keep running for minutes, maybe 10 or so, before ending in a lost connection (PR_CONNECT_RESET_ERROR in Firefox on Windows, NSURLDomainError in Firefox on iOS). These are mostly POST requests. When this issue occurs, other GET requests receive a swift response and a correct result. Normally, POST requests do not take long to process, typically less than 3 seconds.
Recycling the associated worker process will make the issue go away, for hours or days.
When I inspected the web server today while the issue was occurring, I saw little CPU usage (less than 10%), memory at 56%, and the worker process at a modest 615 MB. I saw no logging of these requests, neither in the W3C log nor in my custom application logs.
I added the Web-Request-Monitor feature, as per 'How do I see currently executing web request on IIS 8', but in doing so the worker process probably got recycled, as the issue is not currently occurring.
There are a reverse proxy and an access manager between the internet and my web server. I suppose they could have something to do with this issue, but it is certainly related to IIS, as recycling helps.
All of this is happening on an acceptance web server running a newer version of my application. I am not aware of any big changes to the application's architecture that could be involved. Also, there is very little traffic from other clients, if any at all.
What could be the next steps to investigate this issue further?
Update
This issue was definitely caused by log4net. However, it was not related to the log4net.Internal.Debug setting. It was caused by two application domains accessing the same log file, which occasionally resulted in concurrency issues around the file. It appears that log4net could not handle this properly and got stuck while writing to the log file.
This log file was configured with the RollingFileAppender option. Since we also used AdoNetAppender, we decided to remove file logging altogether.
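(If shared file logging had to stay, one commonly suggested mitigation is log4net's MinimalLock locking model, which acquires and releases the file lock around every write instead of holding it for the appender's lifetime. A hedged sketch of what such an appender configuration could look like is below; the path, sizes, and layout are placeholders, and MinimalLock has its own performance and cross-process safety trade-offs, so removing shared file logging, as we did, remains the simpler fix.)

<appender name="RollingFile" type="log4net.Appender.RollingFileAppender">
  <!-- placeholder path; both application domains would write to this same file -->
  <file value="logs\app.log" />
  <appendToFile value="true" />
  <!-- MinimalLock opens and closes the file around each write, easing (not eliminating) contention -->
  <lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
  <rollingStyle value="Size" />
  <maximumFileSize value="10MB" />
  <maxSizeRollBackups value="5" />
  <layout type="log4net.PatternLayout">
    <conversionPattern value="%date [%thread] %-5level %logger - %message%newline" />
  </layout>
</appender>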
Original
I have found a probable cause. I'll report the steps I took to investigate the issue.
I activated the Worker Processes feature in IIS.
When, after a couple of days of waiting, the issue started again, I found long-running requests. They all had State ExecuteRequestHandler and Module Name ManagedPipelineHandler, with a Time Elapsed of hundreds of seconds.
I also activated Failed Request Tracing with a rule for long-running requests, using a Time Taken threshold of 1 minute.
After a couple of days, I started to receive failed request reports. The failed requests all have a GENERAL_SET_RESPONSE_HEADER event as their last event.
I added additional debug logging events for each request. When debugging in my development environment, at one point, I started to see the hanging behaviour there, on one of the new logging statements(!). The application uses log4net.
I captured a stack trace:
log4net.dll!log4net.Appender.AppenderSkeleton.DoAppend(log4net.Core.LoggingEvent loggingEvent)
log4net.dll!log4net.Util.AppenderAttachedImpl.AppendLoopOnAppenders(log4net.Core.LoggingEvent loggingEvent)
log4net.dll!log4net.Repository.Hierarchy.Logger.CallAppenders(log4net.Core.LoggingEvent loggingEvent)
log4net.dll!log4net.Repository.Hierarchy.Logger.Log(System.Type callerStackBoundaryDeclaringType, log4net.Core.Level level, object message, System.Exception exception)
log4net.dll!log4net.Core.LogImpl.DebugFormat(string format, object arg0)
The DoAppend method uses lock(this), which may very well cause hangs.
I also found out that the config setting log4net.Internal.Debug was set to true, which I do not want under normal circumstances, and this may be related. I did not attempt to understand the log4net code, but I remember that logging initially did not work in the acceptance environment, so the setting may well have been set to true back then, causing the issue to start.
Another indication that log4net is involved: when the issue last occurred, I realized that standard-level logging only happens in some POST requests. I found a POST request that does not log, and requests to it were handled normally, while the other POST requests still hung.
For now, I have set log4net.Internal.Debug to false and will wait to see what happens.
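(For reference, log4net's internal debugging is toggled from appSettings in web.config or app.config; a minimal sketch of turning it off:)

<!-- web.config / app.config -->
<appSettings>
  <add key="log4net.Internal.Debug" value="false" />
</appSettings>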
The fact that recycling IIS fixes the issue doesn't mean it is an IIS issue, because all ASP.NET applications run in the .NET runtime; it is only an IIS issue if it can be shown that the request is hanging in an IIS module.
So you may need to wait for the issue to happen again and then create a Failed Request Tracing rule based on time-taken. That will tell us whether the issue is happening in an IIS pipeline module or in the .NET runtime.
If all requests hang in the .NET runtime, you may have to capture a hang dump and do a deep analysis with WinDbg and the MEX extension. That will tell us what is happening there.

Can a web server begin responding before the client has sent the full request?

I am writing a web application for an academic research group. The researchers need to be able to upload large data sets (100MB - 1GB) in CSV format. I've written the server to process the data as it comes in. This means that if there is an error in the first row of the CSV, we can return an error straight away.
However, when this happens, the browser reports that "The connection was reset" or similar. Clearly, my web server is responding in a way that doesn't make sense.
If I explicitly close the HTTP request stream (this is Kotlin on the JVM, by the way) before returning the error to the browser, then the problem goes away. However, it turns out that the close implementation of the request stream first reads the whole stream to its end. So at that point the user still has to wait 30+ minutes to find out that there is an error in the first row of their CSV.
Is what I am trying to do possible? Does the HTTP protocol permit a web server, in any circumstances, to begin responding before the full request body has been sent? If not, can you suggest a workaround that would allow me to deliver a user experience where the user doesn't have to wait for the whole file to be uploaded before finding out if there are any problems?
The answer is yes: according to the HTTP spec, servers may send a response early, and the client should then stop sending the request body. Most browsers, however, don't implement this correctly.
In theory, your HTTP server needs to return a 4xx error code with a response body, then reset the connection to prevent the upload from continuing in the background. See the answers below for a more detailed description of the issue. There are a couple of browser versions that do support this, so if you're doing this in lab conditions where you can control the client being used, the links below will help.
https://stackoverflow.com/a/14483857/2274303
https://stackoverflow.com/a/18370751/2274303
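A minimal sketch of that early-4xx-then-reset behaviour, written against Node's built-in http module purely for illustration (the question itself uses Kotlin on the JVM, and the validation rule and port here are placeholders):

// early-reject-sketch.ts -- respond before the body finishes, then drop the connection
import http from "node:http";

const server = http.createServer((req, res) => {
  let decided = false;

  req.on("data", (chunk: Buffer) => {
    if (decided) return;
    decided = true;
    // Placeholder validation: look only at the first line of the first chunk.
    const firstLine = chunk.toString("utf8").split("\n")[0];
    if (!firstLine.includes(",")) {
      res.writeHead(400, { Connection: "close" });
      res.end("First CSV row looks invalid", () => {
        // Once the error response is flushed, reset the connection so the client
        // stops uploading the rest of the body.
        req.destroy();
      });
    }
  });

  req.on("end", () => {
    if (!res.writableEnded) {
      res.writeHead(200);
      res.end("Upload accepted");
    }
  });
});

server.listen(8080);

Whether the browser actually shows the 400 body, or just reports a reset connection, depends on the browser, which is exactly the caveat above.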
[edit]
To answer your question about using a workaround, chunking the uploads using javascript is a good way to mitigate internet connectivity issues, but if you want to parse it in real time it's not as simple as arbitrarily breaking up the file into pieces. You need to make sure you're not splitting the file in the middle of a line, otherwise it will fail even if the data is valid. That brings up the issue of parsing a 1GB file in javascript, which isn't a good idea imo.
If you want to use javascript, continue uploading the entire file at once via an ajax request, so you can get the response outside of the main dom and force a redirect or cancel the upload. Depending on which js libraries you're using there are different ways of doing this.
None of this solves the reverse scenario. What if the file is 95% uploaded before there's an error? The researcher will need to either upload the whole thing again or edit the file to only include the rows from the error going forward. That means your application needs to support partial uploads and know to pick up where it left off. All these things are possible, but you're probably not going to find a simple workaround to get this working well.
Without understanding the dataset and what kind of validation you are doing it's hard to come up with a full solution. If parsing each row doesn't depend on the previous rows being valid, you could always upload the whole file, then display the rows with errors at the end and ask them to upload a second file with just the corrections.
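To make the line-aligned chunking idea above concrete, here is a hedged client-side sketch; the endpoint, chunk size, and helper name are hypothetical, and the server would need a matching endpoint that validates each chunk and reports errors per chunk:

// upload-in-line-chunks.ts -- hypothetical helper: upload a File in line-aligned chunks
// so the server can validate each piece and the client can stop at the first error.
async function uploadInLineChunks(file: File, url: string, chunkBytes = 5 * 1024 * 1024): Promise<void> {
  let offset = 0;
  let carry = ""; // incomplete last line held back from the previous chunk
  while (offset < file.size) {
    // Byte-based slicing; assumes single-byte characters at chunk boundaries (fine for ASCII CSV).
    const slice = await file.slice(offset, offset + chunkBytes).text();
    offset += chunkBytes;
    const text = carry + slice;
    // Send only up to the last complete line; keep the remainder for the next chunk.
    const cut = offset < file.size ? text.lastIndexOf("\n") + 1 : text.length;
    carry = text.slice(cut);
    const body = text.slice(0, cut);
    if (body.length === 0) continue; // no complete line yet; read more first
    const resp = await fetch(url, { method: "POST", body });
    if (!resp.ok) {
      // Stop immediately; the server rejected this chunk, so nothing more is uploaded.
      throw new Error(`Chunk ending at byte ${offset} rejected: ${await resp.text()}`);
    }
  }
}

This keeps the upload resumable in principle (the client knows the byte offset of the last accepted chunk), which also helps with the partial-upload concern mentioned above.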
The normal request/response flow with an HTTP web server looks like this:
Server listens for request
Client creates request
Client sends request to server
Server processes request
Server creates response
Server sends response to client
Client processes response
The client opens the connection and the server is able to respond on that same connection; however, if you close the connection, the server would need to send its response on another connection, and the browser may not allow the server to open a new connection that the client didn't request.
You may be able to respond by reading the first line and creating an error quickly, but the client will not read the response until it is done sending the request.
By sending the file in chunks or asynchronously sending lines of the file, you will be able to give feedback more immediately. You will be sending many smaller requests with the ability to respond in between.
The question was about the HTTP protocol. I feel this would be allowed by the protocol if you wrote a custom client and server, but if you are using browsers then you must use HTTP as the vendors have implemented it. In a custom client you could check for interruptions, but most browsers will probably send the full request before listening for a response, which is also a reason AJAX took off 20 years ago.

Application Insights removing telemetry after it has been logged

I've had Application Insights set up on my ASP.NET project for a couple months with no issues. I use Custom Events for logging certain events.
Recently, I tried to add a Custom Event after a user has authenticated, in order to track login behavior. My custom event DOES get logged in the Application Insights debug session. I know this because I can see it in the telemetry when paused on a breakpoint just after the event.
However, when I continue running the application, my custom event no longer shows up in the telemetry. It just disappears.
I cannot understand what the issue is. Does anyone familiar have any (application) insights? I couldn't help myself ;)
There are some things to check:
Are you logging to one resource (iKey) and searching on another? (A lot of people send data to one resource in dev/debug and a different resource in release/prod environments, so make sure you're sending to the place you expect, and searching in the place you expect.)
Is the data actually going out successfully? You may need to use Fiddler or some other tool to watch your outbound HTTP for calls to dc.services.visualstudio.com. It could be that there's something wrong with the data you're sending, or maybe you're getting capped or throttled by the service. If that's the case, the outbound requests will have responses other than 200, and the response will generally tell you why any rejected items weren't accepted.
If the data is getting sent successfully and is going where you expect it to go, there might just be a delay in backend processing. You can always check aka.ms/aistatus to see if there are any current issues with the service.
I am confused, however, by what you mean when you say
However, when I continue running the application, my custom event no longer shows up in the telemetry. It just disappears.
What do you mean by "it just disappears"? If you see it in the output window, then the SDK saw it, and it will get sent, barring any of the above 3 items. Where is it "disappearing" from? Unless you clear the output window, it's never gone from there. If you're talking about the VS search tools that show data sent by the AI SDK during debug, that tool currently has a cap of the most recent 250 items that have occurred during the debug session.

Debugging requests which are 'stuck' in an IIS worker process

In case of TL;DR - I basically need guidance regarding what tools are available to debug requests which are issued to IIS and which stall inside a module.
I have a problem with an old ASP 2.0 app at the moment whereby it will periodically become unavailable and recycling the app pool (horrible as that may be) doesn't bring it back up 100% of the time.
So first of all it presents itself as requests entering the app pool and being trapped in state 'BeginRequest' in RewriteModule.
It is not a specific request which is always the first to experience this issue. The issue cannot be easily recreated either.
Eventually requests join this backlog, and when it becomes 70+ deep the app pool fails to respond to pings from WAS and is forcibly recycled. Predictably it doesn't stop on time and the old app pool is forced to stop. When the new app pool comes up it either works just fine or it instantly experiences the same issue as the outgoing one and requests begin to queue.
In issues like this, all the official guidance is understandably focused on why the RewriteModule may choke.
I have validated my redirections and though complex there are no obvious issues with syntax (XML validates).
Likewise in inetmgr loading up the URL Rewrite Module seems to parse the configs fine and show them visually.
Basic stuff like permissions is all fine.
When the app is working normally I also used Failed Request Tracing/Logging to look at the request pipeline for a sample URL which had stalled, and I can confirm that there is no circular logic and no weird errors present - the request seems to be handled just fine. This also showed me how early in the pipeline the rewritemodule is invoked, and from this I really don't see how the issue could be app-related, as .NET isn't invoked at that point.
Annoyingly, when an app pool is experiencing this issue and I can throw in requests which just stall, Failed Request Tracing is no good, because a request actually needs to reach the end of its journey and fail; otherwise it refuses to log anything.
I resorted to taking process dumps of affected w3wp.exe's and running them through DebugDiag. Unfortunately the only thing I see is that threads are open accessing the rewritemodule but precious little about what they are stuck on.
As anyone else would do I've tried to track the start of the issue back to any recently installed patches or code changes but nothing matches. Likewise this is happening on 3x servers otherwise I would try reinstalling the rewritemodule. Other sites on the same server which invoke rewritemodule are unaffected.
Has anyone else experienced issues like this - the net seems to have relatively little info in this case. Perhaps you can recommend further debugging tools or approaches for IIS which I can adapt to this scenario? This is sort of a cry for help from someone more used to Apache/Nginx - sorry for the long post.

synchronizer - unable to get client-side resource

I already put this into the old forum so I hope this will be fine.
Suddenly, users on the CMS side at one location are getting errors. If they work from elsewhere, there are no problems. I know forum usage is low, but if I am to slap the network people silly I need some pointers.
Users get several errors while the homepage is loading.
Err 1: A few times: JavaScript alert -
[synchronizer] unable to get client-side resource with ID xxxx
Err 2: Sometimes:
Unspecified error. on /library/javascript/mdvc.js
Err 3: several times:
A GUI system error occured. Details:[CmdsHTTPDone]
<tcmapi:Response xmlns:tcmapi="http://www.tridion.com/ContentManager/5.0/TCMAPI" success="false" actionWF="false" ID="WebGUIResponder.aspx"><tcmapi:Error><tcm:Line Cause="true" mlns:tcm="http://www.tridion.com/ContentManager/5.0"><![CDATA[Request message cannot be empty. ]]></tcm:Line></tcmapi:Error></tcmapi:Response>
Err 4: Sometimes we also get "permission denied" errors on TaskBarControl.js or other scripts.
In the end, all views are empty.
When we use a web proxy tool (Fiddler2) to see what is sent and received, users do NOT get any problems: they can log in and use the CMS without any issues. As long as the local web proxy tool is running, users have no problems with the CMS. As soon as the tool is shut down, the same problems come back.
So we cannot even debug with this tool, because we don't know what impact Fiddler has on the connection that makes it work. It happens in just one location, for both Prod and Test (same issues), but DEV is still fine. So my deduction is that "some rule in the local network" is wrong - but how do I proceed?
The CME GUI loaded in the browser regularly checks back with the CME server. This looks like the browser cannot get a connection to the CME server.
For further troubleshooting, you can see what happens when you do a full reload (CTRL-F5) in the web browser, to check whether it is indeed a connection issue.
If it is a connection issue it might not be Tridion related at all.
This is probably a proxy issue -- especially since you say that you cannot reproduce it using Fiddler. Fiddler works by acting as a proxy, so that would explain the lack of symptoms when using it.
You can try just using your browser's developer tools (press F12). Then watch for any requests that come back with a different status code than 200 or 304. You can then show this to your network team who can hopefully troubleshoot the issue from there.
