fsc.exe is very slow because it tries to access crl.microsoft.com - networking

When I run the F# compiler - fsc.exe - on our build server it takes ages (~20 sec) to run, even when there are no input files. After some investigation I found out that it's because the application tries to access crl.microsoft.com (probably to check whether some certificates have been revoked). However, the account under which it runs doesn't have access to the Internet. And because our routers/firewalls/whatever just drop the SYN packets, fsc.exe retries several times before giving up.
The only solution that comes to mind is to point crl.microsoft.com to 127.0.0.1 in the hosts file, but that's a pretty nasty solution. Moreover, I'll need fsc.exe on our production box, where I can't do such things. Any other ideas?
Thanks

I've come across this myself - here are some links to better descriptions and some alternatives...
http://www.eggheadcafe.com/software/aspnet/29381925/code-signing-performance-problems-with-certificate-revocation-chec.aspx
I dug this up from an old MS KB for Exchange when we hit it... Just get the DNS server to reply as stated (might be the solution for your production box).
MS Support KB
The CRL check is timing out because it never receives a response. If a router were to send a “no route to host” ICMP packet or similar error instead of just dropping the packets, the CRL check would fail right away, and the service would start. You can add an entry to crl.microsoft.com in the hosts file or on the DNS server and send the packets to a legitimate location on the network, such as 127.0.0.1, which will reject the connection..."
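For reference, a hosts entry of the kind described above would look something like this (path shown for Windows; pointing the name at 127.0.0.1 makes the connection get refused immediately instead of timing out):

# C:\Windows\System32\drivers\etc\hosts
127.0.0.1    crl.microsoft.com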

Server receiving webRequests long after removed from LoadBalancer

I had the following issue on a system that I supported ~7 years ago. We never got to the bottom of it, and focus shifted onto other issues. I was recently reminded of it, and wondered if anyone would know what was going on. But alas I'll be a little short on details. Sorry.
The Setup
I had a farm of web servers sitting behind a load balancer. The servers were hosting a system that would receive HTTP requests (XML &/or SOAP) from clients, then for each one kick off a bunch of further HTTP requests to 3rd-party suppliers, wait for the suppliers' responses, process and combine the results, and respond to the client's request.
Think insurance comparison, but as a business-to-business XML service.
The whole process would take several seconds, from receiving the initial client request to sending back a response to that original HTTP request, and the server would be processing 10s or 100s of requests in parallel (i.e. at any given point, a given webserver would have many client requests that had come in and been logged, but not yet been responded to).
We had detailed logging which recorded the receipt of each request, including the origin IP and which server was processing it, and recorded when a response was sent.
All client requests were sent to a single IP address (well, URL), which was the address of the loadbalancer, which would then forward requests to the webservers, which weren't individually accessible to the internet (they didn't have public IP addresses).
Our load balancer would allow us to take individual web-servers out of rotation, for maintenance.
When we did that we could watch the DB logs, and see new requests stop coming in and the existing requests gradually get completed, until there were no outstanding requests and the server was idle.
The problem
We found that sometimes, when we took a server out of rotation ... it wouldn't entirely stop receiving requests. You could see the large bulk of requests suddenly stop coming in, but it would still receive a trickle of fresh requests (I don't know ... maybe 0.1% of normal load, maybe less?). I think the longest we left it going was maybe ... 10 minutes?
Notably we realised that all of those requests were coming from a single client/IP address (I don't remember which).
I forget whether other (still-in-rotation) webservers were still receiving requests from this client, but I think they were?
If we rebooted the webserver, no further requests would come in after restarting.
Web stack was Windows, IIS, ASP.NET; pretty old school even at the time. All servers individually owned and configured.
What was happening?
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server. But that was BS-waffle, and since we never needed to actually understand what was going on, we ignored it and moved on with our lives :)
But I'd still like to know what we were seeing, if anyone can diagnose it from that description.
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server.
That sounds like a good explanation.
Normally, a LB will refuse new connections to a removed server, but will allow open connections to live on until they naturally close. This is known as "connection draining" or "graceful shutdown".
If one of your clients had HTTP keepalive on, and was holding a TCP connection open and sending HTTP requests through it for a long time, it would give the symptoms you describe.
Most LBs will have a configuration knob for how long to wait for connections to close before force-closing them during this "connection draining" time. You can set a timeout here to avoid this scenario if it is a problem for you.
The HTTP connection handling behaviour of clients varies, to a large extent at the client's discretion. Perhaps most of your clients were of one type (say, web browsers) and weren't holding a single connection open for 10 mins, but perhaps one client was different (say, a programmatic HTTP API client)?
Further reading about "connection draining" on AWS Load Balancers here (the exact details will vary by LB vendor): https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-conn-drain.html
Further reading about HTTP keep alive here: https://en.wikipedia.org/wiki/HTTP_persistent_connection
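To make the keep-alive scenario concrete, here is a minimal sketch in Python (host name and path are made up) of a client that holds one TCP connection open and pushes many requests through it; a backend that accepted this connection before being taken out of rotation would keep seeing these requests until the connection closes:

import http.client
import time

# Hypothetical endpoint; a real client would be pointed at the load balancer's address.
conn = http.client.HTTPConnection("lb.example.com", 80)
for i in range(100):
    # Every request travels down the same TCP connection (HTTP keep-alive),
    # so whichever backend accepted the connection keeps receiving requests
    # even after the load balancer stops routing *new* connections to it.
    conn.request("GET", "/quote?id=%d" % i)
    resp = conn.getresponse()
    resp.read()        # drain the body so the connection can be reused
    time.sleep(5)      # the connection stays open between requests
conn.close()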

What are the general rules for getting the 104 "Connection reset by peer" error?

Are there any general rules on when a website sends out a TCP reset, triggering the Connection reset by peer error?
Like
too many open connections
too high bandwidth use
connected for too long
…?
I'm pretty certain that there is no law governing this and that different websites/web developers have different tastes, but I would be interested if there are some general rule sets (from websites or textbooks on the subject or what you have been taught in school/at work) that are mostly followed.
The reason I'm asking, of course, is that I want to get around being blocked…
I'm downloading some government data that is freely available but lacks an API or anything similar, so the two official ways to get it are either clicking around in some web GIS a few thousand times, or going down the Kafkaesque path of explaining to various levels of clerks the concepts of databases, CSV files, ZIP files, and that you can't (and wouldn't need to, if they just did what you're trying to explain to them) simply drive to their agency with a "giant" hard drive. So I'm trying to go the most resource-saving way for everyone involved…
A website does not "send" a "Connection reset by peer" error. This error is generated by the OS kernel on the client side when it receives a TCP reset for an active connection. There are many reasons such a TCP reset might be sent. It might be sent by design as part of some load limit, for example to limit the number of connections from the same IP address within a specific time as a form of DoS protection, to restrict data scraping, or to enforce some kind of fair use. There is no general rule, let alone a law, for these kinds of explicit limits.
A TCP reset might also be caused by the application being overloaded, the application crashing, the system running out of resources, and so on.
And a TCP reset will happen if the client writes to a connection which the server already considers closed. This can happen, for example, with HTTP keep-alive: the server might close the connection due to inactivity at any time after the HTTP response was sent. If the client sends a new request on the same connection at the same time the server closes it, the server will reject the new request (since the connection is closed on the server's end) and will send a TCP RST, causing a "connection reset by peer" at the client. The client needs to handle this situation properly by creating a new connection and sending the request again (provided the request is not state-changing, i.e. it is idempotent).
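A rough sketch of that client-side handling, in Python with made-up names: reuse a keep-alive connection, but when a request fails because the server already closed it, open a fresh connection and resend once (safe only because the request is idempotent):

import http.client

class RetryingClient:
    # Hypothetical helper, not a drop-in library class.
    def __init__(self, host):
        self.host = host
        self.conn = http.client.HTTPConnection(host)

    def get(self, path):
        for attempt in (1, 2):
            try:
                self.conn.request("GET", path)
                return self.conn.getresponse().read()
            except (ConnectionResetError, http.client.RemoteDisconnected, BrokenPipeError):
                # The server closed the idle connection; reconnect and retry once.
                self.conn.close()
                self.conn = http.client.HTTPConnection(self.host)
                if attempt == 2:
                    raise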

How to limit HTTPS to one TCP connection?

I'm using uIP along with mbed TLS to run a simple web server on a microcontroller, and host an HTTPS page.
The problem is: my chip only has enough RAM to handle one TLS connection at a time, but Firefox (and Chrome) tries to open multiple connections at once to load the images on the page. If I tell uIP to abort or close additional connections, Firefox assumes an error and gives up loading the rest of the page.
I can tell uIP to limit the total connections to 1, and in that case it just drops new SYN packets if there is already a connection. This actually works, as Firefox will wait and retry until the page is fully loaded. I can't use this as a solution, however, since I do need to allow more than 1 TCP connection in total in order to handle other types of connections (I can serve a regular HTTP web page at the same time, for example). If I could tell uIP to limit connections on a specific port to 1 at a time, that might solve the problem, but I don't think uIP has that capability. I also don't see a way to force uIP to drop certain packets.
I've looked all over the web, but I can't find any information on running a web server using just one TCP connection at a time.
Does anyone have any ideas?
Thanks!
Marlon
Just ignore the SSL connection until you are ready to process it. Browsers should tolerate this.

TcpListener stops accepting or accepts broken connections

We are currently experiencing a problem with a self-written server application running on Windows (it occurs on different versions). The server listens on a TCP port, accepts connections, exchanges some data and then closes the connections again. There are about 100 clients that connect from time to time.
Sometimes the server stops working: log files show that connections are still accepted, but that at the first read attempt a socket error (10054 - Connection reset by peer) occurs. I don't think it is a client issue, because it suddenly stops working for all clients.
We have now found out that the same problem occurs with our old server software, which is even written in a different programming language. So it doesn't seem to be an error in our program - I think it has to be some kind of OS/firewall issue? Of course, firewalls have been deactivated, but that didn't solve the issue.
Any ideas where to look? Wireshark logs will follow soon...
Excerpt from the log (Timestamp, Thread Id, message)
11:37:56.137 T#3960 Connection from 10.21.13.3
11:37:56.138 T#3960 Client Exception: Socket Error # 10054
Connection reset by peer.
11:37:56.138 T#3960 ClientDisconnected
11:38:00.294 T#4144 Connection from 10.21.13.3
You can see that the exception occurs almost at the same time as the connection is accepted, in this case the client reconnects after a few seconds.
A "stateful" firewall or NAT keeps track of connections, and ought to send RSTs for connectiosn it doesn't know about. If the firewall loses track of connections for some reason, then you'll probably see random connections being reset.
Our router at work does this — it forgets about connections when the PPP connection dies, which is remarkably unhelpful when it rains and the DSL restart takes a bit too long. However, instead of resetting connections, it just drops packets (even more unhelpful!).
Sounds like a firewall or routing issue - maybe stale connections get disconnected after a timeout period. Are you using a ping/keepalive inside your protocol?
Otherwise, you can use Wireshark to see what is going on.
First, thanks for the many hints - I'm afraid the problem was a completely different one, which you couldn't possibly have solved by reading my question.
The server application uses log4net, configured with a log file and ImmediateFlush = true. Since every log statement is written directly to the file, the whole application slows down when many socket connections occur.
The server needed about a minute to actually accept a connection, which was far more than the timeout on the client side. So the log only showed "accepted" followed by "disconnected" - even the logging itself was delayed!
Sorry for the inconvenience...
Have you tried changing the backlog, and then seeing how much time passes or how many clients are served before this problem occurs?
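To illustrate what the backlog controls, a minimal sketch in Python (port number made up; in .NET the equivalent knob is the TcpListener.Start(backlog) overload):

import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("0.0.0.0", 9000))       # hypothetical port
# The backlog is the queue of completed-but-not-yet-accepted connections the OS
# holds while the application is busy; if it overflows, new connection attempts
# may be refused or dropped.
srv.listen(200)
while True:
    client, addr = srv.accept()   # a slow accept loop eats into that queue
    client.sendall(b"hello\n")    # placeholder for the real request handling
    client.close()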
You don't say what Windows versions you're using for the server, but you should be aware that the Windows TCP/IP stack behaves differently in server and client OSes. There are limits on how many simultaneous incoming connections a client OS will allow, and they are significantly lower than you might expect.
What do the logs look like from the client side?
Since the error states that the client is dropping the connection: if you see the same error on the client side, then it is a firewall or proxy that is dropping the connection (both sides seeing the opposite side drop the connection is indicative of a proxy/firewall).
If the error is not present on the client side, then I would say that the client side is where the actual error lies.

IIS file download hangs/timeouts - sc-win32-status = 64

Any thoughts on why I might be getting tons of "hangs" when trying to download a file via HTTP, based on the following?
Server is IIS 6
File being downloaded is a binary file, rather than a web page
Several clients hang, including the TrueUpdate and FlexNet web updating packages, as well as a custom .NET app that just does basic HttpWebRequest/HttpWebResponse logic and downloads using a response stream
IIS log file signature when success is 200 0 0 (sc-status sc-substatus sc-win32-status)
For failure, error signature is 200 0 64
sc-win32-status of 64 is "the specified network name is no longer available"
I can point Firefox at the URL and download successfully every time (perhaps some retry logic is happening under the hood)
At this point, it seems like either there's something funky with my server such that it's throwing these errors, or this is just normal network behavior and I need to use (or write) a client that is more resilient to the failures.
Any thoughts?
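As a sketch of what such a more resilient client could look like (Python, hypothetical URL and helper name; it assumes the server honours Range requests, perhaps similar to whatever retry logic Firefox applies under the hood - if the server replies with a full 200 instead of a 206, you'd have to start over rather than append):

import socket
import urllib.request

def download_with_retry(url, dest, attempts=5):
    data = b""
    for _ in range(attempts):
        req = urllib.request.Request(url)
        if data:
            # Resume from where the previous attempt stopped.
            req.add_header("Range", "bytes=%d-" % len(data))
        try:
            with urllib.request.urlopen(req, timeout=60) as resp:
                while True:
                    chunk = resp.read(64 * 1024)
                    if not chunk:
                        break
                    data += chunk
            break                                   # finished cleanly
        except (ConnectionResetError, socket.timeout):
            continue                                # retry and resume
    with open(dest, "wb") as f:
        f.write(data)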
Perhaps your issue was a low level networking issue with the ISP as you speculated in your reply comment. I am experiencing a similar problem with IIS and some mysterious 200 0 64 lines appearing in the log file, which is how I found this post. For the record, this is my understanding of sc-win32-status=64; I hope someone can correct me if I'm wrong.
sc-win32-status 64 means “The specified network name is no longer available.”
After IIS has sent the final response to the client, it waits for an ACK message from the client.
Sometimes clients will reset the connection instead of sending the final ACK back to server. This is not a graceful connection close, so IIS logs the “64” code to indicate an interruption.
Many clients will reset the connection when they are done with it, to free up the socket instead of leaving it in TIME_WAIT/CLOSE_WAIT.
Proxies may have a tendency to do this more often than individual clients.
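If you want to see how often this is happening, here is a small sketch (Python; it assumes the standard W3C log format and reads the field positions from the #Fields: header, so it adapts to however your logging is configured):

import sys

def count_200_0_64(path):
    # Count requests logged with sc-status 200, sc-substatus 0, sc-win32-status 64.
    fields, hits = [], 0
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if line.startswith("#Fields:"):
                fields = line.split()[1:]
            elif line.startswith("#") or not fields:
                continue
            else:
                row = dict(zip(fields, line.split()))
                if (row.get("sc-status"), row.get("sc-substatus"),
                        row.get("sc-win32-status")) == ("200", "0", "64"):
                    hits += 1
    return hits

print(count_200_0_64(sys.argv[1]))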
I've spent two weeks investigating this issue. In my case, intermittent random requests were being prematurely terminated, resulting in IIS logs with status code 200 but with a win32-status of 64.
Our infrastructure includes two Windows IIS servers behind two NetScaler load balancers in HA mode.
In my particular case, the problem was that the NetScaler had a feature called "Integrated Caching" turned on (http://support.citrix.com/proddocs/topic/ns-optimization-10-5-map/ns-IC-gen-wrapper-10-con.html).
After disabling this feature, the request interruptions ceased and the site operated normally. I'm not sure how or why this was causing a problem, but there it is.
If you use a proxy or a load balancer, do some investigation of what features they have turned on. For me the cause was something between the client and the server interrupting the requests.
I hope that this explanation will at least save someone else's time.
Check the headers from the server, especially Content-Type and Content-Length. It's possible that your clients don't recognize the format of the binary file and hang while waiting for bytes that never come, or maybe they close the underlying TCP connection, which may cause IIS to log win32 status 64.
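A quick way to look at those headers (sketch in Python against a made-up URL; curl -I against the real download URL does the same thing):

import urllib.request

# HEAD request: fetch only the response headers, not the file itself.
req = urllib.request.Request("http://yourserver.example.com/update.bin", method="HEAD")
with urllib.request.urlopen(req) as resp:
    print("Content-Type:  ", resp.headers.get("Content-Type"))
    print("Content-Length:", resp.headers.get("Content-Length"))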
Spent three days on this.
It was the timeout, which was set to 4 seconds (a PHP curl request).
Solution was to increase the timeout setting:
//curl_setopt($ch, CURLOPT_TIMEOUT, 4); // times out after 4s
curl_setopt($ch, CURLOPT_TIMEOUT, 60); // times out after 60s
You will have to use Wireshark or Network Monitor to gather more data on this problem, I think.
I suggest you put Fiddler in between your server and your download client. This should reveal the differences between Firefox and the other clients.
Description of all sc-win32-status codes for reference
https://learn.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-
ERROR_NETNAME_DELETED
64 (0x40)
The specified network name is no longer available.
