Creating an idle timeout in Go? - http

I use CloudFlare for one of my high volume websites, and it sits in front of my stack.
The thing is CloudFlare leaves idle connections open in addition to creating new ones, and it's not a setting I can change.
When I have Varnish or Nginx sitting in front listening on port 80 they have out of the box configuration to hang up the idle connections.
This is fine until I had to add a proxy written in Go to the front of my stack. It uses the net/http standard library.
I'm not a Go wizard but based on what people are telling me there are only read and write timeout settings but not hanging up idle connections.
Now my server will fill up with connections and die unless I set a set read and write timeouts, but the problem with this is my backend takes a long time sometimes and it's causing good requests to be cut off when they shouldn't.
What is the proper way to handle idle connections with Go http?
Edit 1: To be more clear, I'm using httputil.NewSingleHostReverseProxy to construct a proxy, which exposes transport options but only for the upstream. The problems I am having are downstream, they need to be set on the http.Server object that uses the ReverseProxy as a handler. http.Server does not expose transport.
Edit 2: I would prefer an idle timeout to a read timeout, since the later would apply to an active uploader.
Thanks

The proper way to hangup idle connections in the Go http server is to set the read timeout.
It is not necessary to set the write timeout to hang up on idle clients. Don't set this value or adjust it up if it's cutting off responses.
If you have long uploads, then use a connection state callback to implement separate idle and read timeouts:
server.ConnState = func(c net.Conn, cs http.ConnState) {
switch cs {
case http.StateIdle, http.StateNew:
c.SetReadDeadline(time.Now() + idleTimeout)
case http.StateActive:
c.SetReadDeadline(time.Now() + activeTimeout)
}
}

See the net/http.Transport docs. The Transport type has some options for dealing with idle HTTP connections in the keep-alive state. From reading your question, the option that seems most relevant to your problem is the MaxIdleConnsPerHost field:
MaxIdleConnsPerHost, if non-zero, controls the maximum idle (keep-alive) to keep per-host. If zero, DefaultMaxIdleConnsPerHost is used.
Reading the code, the default is 2 per host.
The Transport type also has a method to zap all idle connections: CloseIdleConnections.
CloseIdleConnections closes any connections which were previously connected from previous requests but are now sitting idle in a "keep-alive" state. It does not interrupt any connections currently in use.
You can specify a Transport on any http client:
tr := &http.Transport{
TLSClientConfig: &tls.Config{RootCAs: pool},
DisableCompression: true,
MaxIdleConnsPerHost: 1,
}
client := &http.Client{Transport: tr}
resp, err := client.Get("https://example.com")
Another thing worth noting: the docs recommend that you keep a single http client object that is re-used across all your requests (i.e. like a global variable).
Clients and Transports are safe for concurrent use by multiple goroutines and for efficiency should only be created once and re-used.
If you are creating many http client objects in your proxy implementation, it might explain unbounded growth of idle connections (just guessing at how you might be implementing this, though).
EDIT: Reading a little bit more, the net/httputil package has some convenience types for reverse proxies. See the ReverseProxy type. That struct also allows you to supply your own Transport object, allowing you to control your proxy's idle client behavior via this helper type.

Related

https and websocket handler with different deadlines

I have a toy proxy server that accepts connections on a port. I set some deadlines for read/write operations to avoid having too many idle connections from bad clients that fail to close properly.
The problem is that I would like to set a higher deadline for connections that are towards websockets (wss in particular). For plain http requests I can see the 101 Switching Protocols response but https/wss is trickier since I mostly do a io.CopyBuffer from src connection to dst connection and I don't see anything "websocket related" in the initial proxy connection in order to differentiate between https and wss and apply the proper deadline.
I've included a debug screen to such a request towards a wss:// demo server.
Any ideas?
One cannot reliably distinguish between "normal" HTTP traffic and Websockets just from looking at the encrypted data.
One can try to do some heuristics by looking at traffic patterns, i.e. in which direction how many data are transferred in which time and with which idle times between data. Such heuristics can be based on the assumption that HTTP is a request + response protocol with typically small requests shortly followed by larger responses, while Websockets can show arbitrary traffic patterns.
But arbitrary traffic patterns also means that Websockets might be used in a request + response way too. (including request + response though). Also, in some use cases the usage pattern for HTTP consists of mostly larger requests followed by small responses. Thus depending on the kind of application such heuristics might be successful or they might fail.
It is always a good practice to define global Server timeout to make sure that resources are not locked forever. That timeout should be not smaller that longest timeout in all handlers.
DefaultServer = &http.Server{
Handler: http.TimeoutHandler(handler, wssTimeout, timeoutResponse),
...
}
In handler that processes http and wss requests, we need to set timeout dynamically.
func (proxy *ProxyHttpServer) handleHttps(w http.ResponseWriter, r *http.Request) {
// Request Context is going to be cancelled if client's connection closes, the request is canceled (with HTTP/2), Server we created above time outed.
// all code down the stack should respect that ctx.
ctx := r.Context()
timeoit := httpTimeout
if itIsWSS(r) {
timeout = wssTimeout
}
ctx, cancel = cWithTimeout(ctx, timeout)
defer cancel()
// all code below to use ctx instead of context.Backgound()/TODO()

netty client + keep-alive=true

I'm confused for how to deal with lots of connections in netty (3.6.2.FINAL) and keep-alive=true.
For work on a netty client as a server side connector, making http calls to another service, it wants to always keep the connection open for performance (keep-alive=true).
The issue: there is a hard limit for number of open channels, after which the client will hang when attempting to open a channel. Why no exception just hangs? Is this a setting in terms of channel timeout?
I can't seem to understand Netty in terms of overall managing of connections within worker threads:
With a blocking write/read client ChannelHandler (http request/response), how do you detect that the connection pool is empty?
The handler can receive ChannelEvent(s) but nothing about the overall count available in the connection pool (its very non-deterministic anyway). And if the channel is not open, does it make sense for the handler to initiate opening a new channel given its running in a worker thread?
But if the connection pool is exhausted, how do you go and cleanup some idle connections (within the handler)?
I had to completely rip apart my handler to get the client blocking call to work without hanging. The issue was mostly resolved by not holding onto local channel ref within the handler.
Now we just pass a ConnectionInterface#openConnection() [returns a new ChannelFuture] into the shared custom ChannelHandler#call( ConnectionInterface connectionInterface, HttpRequest request ).
Better to open-channel within the handler call method, and to pass that channel along with checks on its state before channel.write(x), if !channel.isWritable() then recycle the channel (from a new client connection eg. ConnectionInterface#openConnection()) and retry the write. There isn't even a need to close the channel (it gets handled in the pool).
Just ran it with 500 threads / 5000 requests and it succeeds fine.

HTTP Keep Alive in a large Web Applications

I have a web application deployed over IIS 7.0. the application is accessible by large number of users and manipulates large data ..my question is concerning the HTTP Keep-Alive option which is set to true by default.
is it a better approach to set the HTTP Keep-Alive to false or true.
in case of true is the good approach to use time out?
KeepAlive should normally be used to handle the requests that immediately follow an HTML request. Let's say on the first visit to your site I get an HTML page with 5 css, 5js and 25 images, I will use my HTTP connection which is still alive to request these things (well, depends on the browser, I'll maybe use 3 connection to speed up these things).
To handle this fact we usually use a Keepalive of 2s or 3s. Having a longer keepalive means the connection is waiting for the next page that the user may request. This may be a valid way of thinking, next time the user will want a page, we'll avoid to loose time establishing HTTP connection (and this can be the longest part of the request/response time). But for your server that mean most of HTTP connection that are handled by the server are doing... nothing. And you will reach your MaxConnection (W3SVC/MaxConnections with a ridiculous default to 10), with connections doing nothing. Really Bad. So long keep-alive needs big webservers and should be used only if your application really needs it.
If you use Keepalive in a 'classical website' you must change the connection timeout (by default 2min). In Apache you would have 2 settings, a keepalive tiemout (5s by default) and a connection timeout (2min). In IIS seems the timeout settings is used for both. So do not set it to 2s (a client really slow in sending his request will timeout), but something like 10s is maybe enough. Now one response is to disallow Keep-Alive, and make the browser opening more connections. Another response is to use a modern webserver (like nginx or cherokee for example) which handles keep-alive connection in a more elegant and resource-free way than Apache or IIS.
Even if you do not use Keepalive, what's the reason of waiting 2 minutes for a client timeout? it's is certainly too high, decrease this value to something like 60s.
Then you should check several settings related to timeout (ConnectionTimeout, HeaderWaitTimeout, MinFileBytesPerSec) and this nice response on performances settings in the registry.
This article will bring more insight and don't forget to check the "How do we fix it?" section
http://mocko.org.uk/b/2011/01/23/http-keepalive-considered-harmful/
I think that it's not a good idea to get all users connected.
Because of:
User just can open your site, but not use it - why we shoud keep connection for long time?
It's hard to keep much connection (more memory)
Use connection time-out (max 5 min will we ok)
BUT: if your application is a live chat - you should kepp alive all connection. In this way better to use Ajax Long Polling Request + Node JS + some fast nosql db to store chat messages.

Relation between HTTP Keep Alive duration and TCP timeout duration

I am trying to understand the relation between TCP/IP and HTTP timeout values. Are these two timeout values different or same? Most Web servers allow users to set the HTTP Keep Alive timeout value through some configuration. How is this value used by the Web servers? is this value just set on the underlying TCP/IP socket i.e is the HTTP Keep Alive timeout and TCP/IP Keep Alive Timeout same? or are they treated differently?
My understanding is (maybe incorrect):
The Web server uses the default timeout on the underlying TCP socket (i.e. indefinite) regardless of the configured HTTP Keep Alive timeout and creates a Worker thread that counts down the specified HTTP timeout interval. When the Worker thread hits zero, it closes the connection.
EDIT:
My question is about the relation or difference between the two timeout durations i.e. what will happen when HTTP keep-alive timeout duration and the timeout on the Socket (SO_TIMEOUT) which the Web server uses is different? should I even worry about these two being same or not?
An open TCP socket does not require any communication whatsoever between the two parties (let's call them Alice and Bob) unless actual data is being sent. If Alice has received acknowledgments for all the data she's sent to Bob, there's no way she can distinguish among the following cases:
Bob has been unplugged, or is otherwise inaccessible to Alice.
Bob has been rebooted, or otherwise forgotten about the open TCP socket he'd established with Alice.
Bob is connected to Alice, and knows he has an open connection, but doesn't have anything he wants to say.
If Alice hasn't heard from Bob in awhile and wants to distinguish among the above conditions, she can resend her last byte of data, wrapped in a suitable TCP frame to be recognizable as a retransmission, essentially pretending she hasn't heard the acknowledgment. If Bob is unplugged, she'll hear nothing back, even if she repeatedly sends the packet over a period of many seconds. If Bob has rebooted or forgotten the connection, he will immediately respond saying the connection is invalid. If Bob is happy with the connection and simply has nothing to say, he'll respond with an acknowledgment of the retransmission.
The Timeout indicates how long Alice is willing to wait for a response when she sends a packet which demands a reply. The Keepalive time indicates how much time she should allow to lapse before she retransmits her last bit of data and demands an acknowledgment. If Bob goes missing, the sum of the Keepalive and Timeout values will indicate the worst-case time between Alice receiving her last bit of data and her deciding that Bob is dead.
They're two separate mechanisms; the name is a coincidence.
HTTP keep-alive (also known as persistent connections) is keeping the TCP socket open so that another request can be made without setting up a new connection.
TCP keep-alive is a periodic check to make sure that the connection is still up and functioning. It's often used to assure that a NAT box (e.g., a DSL router) doesn't "forget" the mapping between an internal and external ip/port.
KeepAliveTimeout Directive
Description: Amount of time the server will wait for subsequent
requests on a persistent connection Syntax: KeepAliveTimeout seconds
Default: KeepAliveTimeout 15 Context: server config, virtual host
Status: Core Module: core The number of seconds Apache will wait for a
subsequent request before closing the connection. Once a request has
been received, the timeout value specified by the Timeout directive
applies.
Setting KeepAliveTimeout to a high value may cause performance
problems in heavily loaded servers. The higher the timeout, the more
server processes will be kept occupied waiting on connections with
idle clients.
In a name-based virtual host context, the value of the first defined
virtual host (the default host) in a set of NameVirtualHost will be
used. The other values will be ignored.
TimeOut Directive
Description: Amount of time the server will wait for certain events
before failing a request Syntax: TimeOut seconds Default: TimeOut 300
Context: server config, virtual host Status: Core Module: core The
TimeOut directive currently defines the amount of time Apache will
wait for three things:
The total amount of time it takes to receive a GET request. The amount
of time between receipt of TCP packets on a POST or PUT request. The
amount of time between ACKs on transmissions of TCP packets in
responses. We plan on making these separately configurable at some
point down the road. The timer used to default to 1200 before 1.2, but
has been lowered to 300 which is still far more than necessary in most
situations. It is not set any lower by default because there may still
be odd places in the code where the timer is not reset when a packet
is sent.

Detecting HTTP close using inet

In my mochiweb application, I am using a long held HTTP request. I wanted to detect when the connection with the user died, and I figured out how to do that by doing:
Socket = Req:get(socket),
inet:setopts(Socket, [{active, once}]),
receive
{tcp_closed, Socket} ->
% handle clean up
Data ->
% do something
end.
This works when: user closes his tab/browser or refreshes the page. However, when the internet connection dies suddenly (say wifi signal lost all of a sudden), or when the browser crashes abnormally, I am not able to detect a tcp close.
Am I missing something, or is there any other way to achieve this?
There is a TCP keepalive protocol and it can be enabled with inet:setopts/2 under the option {keepalive, Boolean}.
I would suggest that you don't use it. The keep-alive timeout and max-retries tends to be system wide, and it is optional after all. Using timeouts on the protocol level is better.
The HTTP protocol has the status code Request Timeout which you can send to the client if it seems dead.
Check out the after clause in receive blocks that you can use to timeout waiting for data, or use the timer module, or use erlang:start_timer/3. They all have different performance characteristics and resource costs.
There isn't a default "keep alive" (but can be enabled if supported) protocol over TCP: in case there is a connection fault when no data is exchanged, this translates to a "silent failure". You would need to account for this type of failure by yourself e.g. implement some form of connection probing.
How does this affect HTTP? HTTP is a stateless protocol - this means that every request is independent of every other. The "keep alive" functionality of HTTP doesn’t change that i.e. "silent failure" can still occur.
Only when data is exchanged can this condition be detected (or when TCP Keep Alive is enabled).
I would suggest sending the application level keep alive messages over HTTP chunked-encoding. Have your client/server smart enough to understand the keep alive messages and ignore them if they arrive on time or close and re-establish the connection again.

Resources