https and websocket handler with different deadlines - http

I have a toy proxy server that accepts connections on a port. I set some deadlines for read/write operations to avoid having too many idle connections from bad clients that fail to close properly.
The problem is that I would like to set a higher deadline for connections that are towards websockets (wss in particular). For plain http requests I can see the 101 Switching Protocols response but https/wss is trickier since I mostly do a io.CopyBuffer from src connection to dst connection and I don't see anything "websocket related" in the initial proxy connection in order to differentiate between https and wss and apply the proper deadline.
I've included a debug screen to such a request towards a wss:// demo server.
Any ideas?

One cannot reliably distinguish between "normal" HTTP traffic and Websockets just from looking at the encrypted data.
One can try to do some heuristics by looking at traffic patterns, i.e. in which direction how many data are transferred in which time and with which idle times between data. Such heuristics can be based on the assumption that HTTP is a request + response protocol with typically small requests shortly followed by larger responses, while Websockets can show arbitrary traffic patterns.
But arbitrary traffic patterns also means that Websockets might be used in a request + response way too. (including request + response though). Also, in some use cases the usage pattern for HTTP consists of mostly larger requests followed by small responses. Thus depending on the kind of application such heuristics might be successful or they might fail.

It is always a good practice to define global Server timeout to make sure that resources are not locked forever. That timeout should be not smaller that longest timeout in all handlers.
DefaultServer = &http.Server{
Handler: http.TimeoutHandler(handler, wssTimeout, timeoutResponse),
...
}
In handler that processes http and wss requests, we need to set timeout dynamically.
func (proxy *ProxyHttpServer) handleHttps(w http.ResponseWriter, r *http.Request) {
// Request Context is going to be cancelled if client's connection closes, the request is canceled (with HTTP/2), Server we created above time outed.
// all code down the stack should respect that ctx.
ctx := r.Context()
timeoit := httpTimeout
if itIsWSS(r) {
timeout = wssTimeout
}
ctx, cancel = cWithTimeout(ctx, timeout)
defer cancel()
// all code below to use ctx instead of context.Backgound()/TODO()

Related

Does Golang's (*http.ResponseWriter) Write() method block until data is recieved by the client?

I ask this question because I had a very weird puzzling experience that I am about to tell.
I am instrumenting an HTTP API server to observe it's behavior in the presence of latency between the server and the clients. I had a setup consisting of a single server and a dozen of clients connected with a 10Gbps Ethernet fabric. I measured the time it took to serve certain API requests in 5 scenarios. In each scenario, I set the latency between the server and the clients to one of the values: No latency (I call this baseline), 25ms, 50ms, 250ms or 400ms using the tc-netem(8) utility.
Because I am using histogram buckets to quantify the service time, I observed that all the requests were processed in less than 50ms whatever the scenario is, which clearly doesn't make any sense as, for example, in the case of 400ms, it should be at least around 400ms (as I am only measuring the duration from the moment the request hits the server to the moment the HTTP Write()function returns). Note that the response objects are between 1Kb to 10Kb in size.
Initially, I had doubts that the *http.ResponsWriter's Write() function was asynchronous and returns immediately before data is received by the client. So, I decided to test this hypothesis by writing a toy HTTP server that services the content of a file that is generated using dd(1) and /dev/urandom to be able to reconfigure the response size. Here is the server:
var response []byte
func httpHandler(w http.ResponseWriter, r * http.Request) {
switch r.Method {
case "GET":
now: = time.Now()
w.Write(response)
elapsed: = time.Since(now)
mcs: = float64(elapsed / time.Microsecond)
s: = elapsed.Seconds()
log.Printf("Elapsed time in mcs: %v, sec:%v", mcs, s)
}
}
func main() {
response, _ = ioutil.ReadFile("BigFile")
http.HandleFunc("/hd", httpHandler)
http.ListenAndServe(":8089", nil)
}
Then I start the server like this:
dd if=/dev/urandom of=BigFile bs=$VARIABLE_SIZE count=1 && ./server
from the client side, I issue time curl -X GET $SERVER_IP:8089/hd --output /dev/null
I tried with many values of $VARIABLE_SIZE from the range [1Kb, 500Mb], using an emulated latency of 400ms between the server and each one of the clients. To make long story short, I noticed that the Write() method blocks until the data is sent when the response size is big enough to be visually noticed (on the order of tens of megabytes). However, when the response size is small, the server doesn't report a mentally sane servicing time compared to the value reported by the client. For a 10Kb file, the client reports 1.6 seconds while the server reports 67 microseconds (which doesn't make sense at all, even me as a human I noticed a little delay on the order of a second as it is reported by the client).
To go a little further, I tried to find out starting from which response size the server returns a mentally acceptable time. After many trials using a binary search algorithm, I discovered that the server always returns few microseconds [20us, 600us] for responses that are less than 86501 bytes in size and returns expected (acceptable) times for requests that are >= 86501 bytes (usually half of the time reported by the client). As an example, for a 86501 bytes response, the client reported 4 seconds while the server reported 365 microseconds. For 86502 bytes, the client reported 4s and the sever reported 1.6s. I repeated this experience many times using different servers, the behavior is always the same. The number 86502 looks like magic !!
This experience explains the weird observations I initially had because all the API responses were less than 10Kb in size. However, this opens the door for a serious question. What the heck on earth is happening and how to explain this behavior ?
I've tried to search for answers but didn't find anything. The only thing I can think about is maybe it is related to Linux's sockets size and whether Go makes the system call in a non-blocking fashion. However, AFAIK, TCP packets transporting the HTTP responses should all be acknowledged by the receiver (the client) before the sender (the server) can return ! Breaking this assumption (as it looks like in this case) can lead to disasters ! Can someone please provide an explanation for this weird behavior ?
Technical details:
Go version: 12
OS: Debian Buster
Arch: x86_64
I'd speculate the question is stated in a wong way in fact: you seem to be guessing about how HTTP works instead of looking at the whole stack.
The first thing to consider is that HTTP (1.0 and 1.1, which is the standard version since long time ago) does not specify any means for either party to acknowledge data reception.
There exists implicit acknowledge for the fact the server received the client's request — the server is expected to respond to the request, and when it responds, the client can be reasonably sure the server had actually received the request.
There is no such thing working in the other direction though: the server does not expect the client to somehow "report back" — on the HTTP level — that it had managed to read the whole server's response.
The second thing to consider is that HTTP is carried over TCP connections (or TLS, whcih is not really different as it uses TCP as well).
An oft-forgotten fact about TCP is that it has no message framing — that is, TCP performs bi-directional transfer of opaque byte streams.
TCP only guarantees total ordering of bytes in these streams; it does not in any way preserve any occasional "batching" which may naturally result from the way you work with TCP via a typical programming interface — by calling some sort of "write this set of bytes" function.
Another thing which is often forgotten about TCP is that while it indeed uses acknowledgements to track which part of the outgoing stream was actually received by the receiver, this is a protocol detail which is not exposed to the programming interface level (at least not in any common implementation of TCP I'm aware of).
These features mean that if one wants to use TCP for message-oriented data exchange, one needs to implement support for both message boundaries (so-called "framing") and acknowledgement about the reception of individual messages in the procotol above TCP.
HTTP is a protocol which is above TCP but while it implements framing, it does not implement explicit acknowledgement besides the server responding to the client, described above.
Now consider that most if not all TCP implementations employ buffering in various parts of the stack. At least, the data which is submitted by the program gets buffered, and the data which is read from the incoming TCP stream gets buffered, too.
Finally consider that most commonly used TCP implementations provide for sending data into an active TCP connection through the use of a call allowing to submit a chunk of bytes of arbitrary length.
Considering the buffering described above, such a call typically blocks until all the submitted data gets copied to the sending buffer.
If there's no room in the buffer, the call blocks until the TCP stack manages to stream some amount of data from that buffer into the connection — freeing some room to accept more data from the client.
What all of the above means for net/http.ResponseWriter.Write interacting with a typical contemporary TCP/IP stack?
A call to Write would eventially try to submit the specified data into the TCP/IP stack.
The stack would try to copy that data over into the sending buffer of the corresponding TCP connection — blocking until all the data manages to be copied.
After that you have essentially lost any control about what happens with that data: it may eventually be successfully delivered to the receiver, or it may fail completely, or some part of it might succeed and the rest will not.
What this means for you, is that when net/http.ResponseWriter.Write blocks, it blocks on the sending buffer of the TCP socket underlying the HTTP connection you're operating on.
Note though, that if the TCP/IP stack detects an irrepairable problem with the connection underlying your HTTP request/response exchange — such as a frame with the RST flag coming from the remote part meaning the connection has been unexpectedly teared down — this problem will bubble up the Go's HTTP stack as well, and Write will return a non-nil error.
In this case, you will know that the client was likely not able to receive the complete response.

Proper way to mux an HTTP server in Go that can handle non-HTTP protocols

I have built a router with extended logging capabilities using Go. It works properly for most use cases. However, it encounters problems when clients send non-standard HTTP messages on port 80.
To date, I have solved this by implementing my own version of ServeHTTP():
func (myproxy *MyProxy) ServeHTTP(w http.ResponseWtier, r *http.Request) {
// Inspect headers
// Determine if it is a custom protocol (ie: websockets, CONNECT requests)
// Implement handlers for each time
}
In the event that I determine a request is a non-standard HTTP protocol, the request is played back to the original destination (via http.Request.Write()) and everyone is happy.
At least, most of the time. The problem arises with edge cases. Tivo clients do not send "Host" headers and appear to not like all kinds of other standard things that Go does (such as capitalizing header names). The number of possible variations on this is endless so what I would very much like to do is to buffer the original request - exactly as it was sent to me without any modifications - and replay it back to the original destination server.
I could hack this by re-implementing net.Http.Server, but this seems like an extraordinarily difficult and brittle approach. Would I would prefer to do would be to somehow hook into net/http/http/server.go right around the point where it receives a connection, then wrap that in a spy wrapper that logs the request to a buffer.
func (srv *Server) Serve(l net.Listener) error {
// Lots of listener code...
c := srv.newConn(rw)
c.setState(c.rwc, StateNew) // before Serve can return
// I'd like to wrap c here and save the request for possible reply later.
go c.serve(ctx)
}
https://golang.org/src/net/http/server.go?s=93229:93284#L2805
Edit: I have looked at httpUtil:DumpRequest but this slightly modifies the original request (changes case and order of headers). If it didn't do that, it would be an ideal solution.
https://godoc.org/net/http/httputil#DumpRequest
Is there a way to hook connections around this point before they are parsed by http.Request?
In the interest of helping others, I wanted to answer my question. The approach I suggested above does in fact work and is the proper way to do this. In summary:
Implement ListenAndServe()
Wrap the incoming net.Conn in a TeeReader or other multiplexed connection wrapper
Record the request
Dial the original destination and connect with the inbound connection, replaying the original request if necessary.
A similar use case is required when upgrading connection requests for websockets servers. A nice writeup can be found here:
https://medium.freecodecamp.org/million-websockets-and-go-cc58418460bb

Bing API HTTP request timeout

I'm using Bing V7 API and sending an HTTP requests for this endpoint:
https://api.cognitive.microsoft.com/bing/v7.0/search
When I'm define my HttpClient, I need to select the right Timeout value. To short timeout, will makes me loose some answers from the server. Too long timeouts, will make me wait, even if the server is not there.
I looked on Bing documentation and didn't find the right value.
What is the right HTTP request timeout for this calls?
What is the right HTTP request timeout for this calls?
I'm using Bing V7 API and sending an HTTP requests for this endpoint:
Bing provides their API via a HTTP endpoint. This has nothing really to do with the API itself in my opinion as HTTP is just the transport in this situation. HTTP request context is normally handled by eg. reverse proxies such as NGINX (or likely MS IIS here). Hence no documentation in the API docs.
When I'm define my HttpClient, I need to select the right Timeout value. To short timeout, will makes me loose some answers from the server. Too long timeouts, will make me wait, even if the server is not there.
The timeout value in your HttpClient is just ment to eventually recover from a blocking situation. This means that your program won't block indefinitely, but will at some point terminate the HTTP action at hand. This is useful if your HttpClient got into a eg. network split situation, deadlock or similar situation and no reply will ever come.
A timeout value between 45 to 60 seconds is plenty.
Too long timeouts, will make me wait, even if the server is not there.
I would keep the HttpClient timeout value at a fixed eg. 60 seconds and have a second "supervisor" thread doing some more dynamic "Smoke test" to check if connectivity is ok or if there is some other problem at which point you then can terminate HttpClient early.

Creating an idle timeout in Go?

I use CloudFlare for one of my high volume websites, and it sits in front of my stack.
The thing is CloudFlare leaves idle connections open in addition to creating new ones, and it's not a setting I can change.
When I have Varnish or Nginx sitting in front listening on port 80 they have out of the box configuration to hang up the idle connections.
This is fine until I had to add a proxy written in Go to the front of my stack. It uses the net/http standard library.
I'm not a Go wizard but based on what people are telling me there are only read and write timeout settings but not hanging up idle connections.
Now my server will fill up with connections and die unless I set a set read and write timeouts, but the problem with this is my backend takes a long time sometimes and it's causing good requests to be cut off when they shouldn't.
What is the proper way to handle idle connections with Go http?
Edit 1: To be more clear, I'm using httputil.NewSingleHostReverseProxy to construct a proxy, which exposes transport options but only for the upstream. The problems I am having are downstream, they need to be set on the http.Server object that uses the ReverseProxy as a handler. http.Server does not expose transport.
Edit 2: I would prefer an idle timeout to a read timeout, since the later would apply to an active uploader.
Thanks
The proper way to hangup idle connections in the Go http server is to set the read timeout.
It is not necessary to set the write timeout to hang up on idle clients. Don't set this value or adjust it up if it's cutting off responses.
If you have long uploads, then use a connection state callback to implement separate idle and read timeouts:
server.ConnState = func(c net.Conn, cs http.ConnState) {
switch cs {
case http.StateIdle, http.StateNew:
c.SetReadDeadline(time.Now() + idleTimeout)
case http.StateActive:
c.SetReadDeadline(time.Now() + activeTimeout)
}
}
See the net/http.Transport docs. The Transport type has some options for dealing with idle HTTP connections in the keep-alive state. From reading your question, the option that seems most relevant to your problem is the MaxIdleConnsPerHost field:
MaxIdleConnsPerHost, if non-zero, controls the maximum idle (keep-alive) to keep per-host. If zero, DefaultMaxIdleConnsPerHost is used.
Reading the code, the default is 2 per host.
The Transport type also has a method to zap all idle connections: CloseIdleConnections.
CloseIdleConnections closes any connections which were previously connected from previous requests but are now sitting idle in a "keep-alive" state. It does not interrupt any connections currently in use.
You can specify a Transport on any http client:
tr := &http.Transport{
TLSClientConfig: &tls.Config{RootCAs: pool},
DisableCompression: true,
MaxIdleConnsPerHost: 1,
}
client := &http.Client{Transport: tr}
resp, err := client.Get("https://example.com")
Another thing worth noting: the docs recommend that you keep a single http client object that is re-used across all your requests (i.e. like a global variable).
Clients and Transports are safe for concurrent use by multiple goroutines and for efficiency should only be created once and re-used.
If you are creating many http client objects in your proxy implementation, it might explain unbounded growth of idle connections (just guessing at how you might be implementing this, though).
EDIT: Reading a little bit more, the net/httputil package has some convenience types for reverse proxies. See the ReverseProxy type. That struct also allows you to supply your own Transport object, allowing you to control your proxy's idle client behavior via this helper type.

Detecting missing responses to long running HTTP (SOAP) requests

I need a way to detect a missing response to a long running HTTP POST request. This problem arises when the network infrastructure (firewalls, proxies, unplugged cables, etc.) drops the response packets. The server may detect this failure, but the client cannot send additional bytes after the POST to probe the state of the TCP connection. The failure may be limited to a single TCP connection. For example I may be able to subsequently open a new TCP connection to the server.
I'm looking for a solution that still uses HTTP POST and does not change the duration of the server side processing.
Some solutions that I can think of are:
Provide a side channel interface to retrieve request & response history. If the history lists the response as having been send (presumably resulting in a TCP error) but I have not yet received it within a reasonable time I can generate a local error.
Use an X header to request that the server deliver "spurious" 100 Continue provisional responses on a regular interval. If I fail to see an expected 100 Continue or a non-provisional response I can generate a local error.
Is there a state of the art solution for this problem?
It sounds to me like you are using Soap for something that would be much better done using a stateful connection, or a server side push technology.

Resources