Number of connections from a client to servers with proxy - http

I want to make simple HTTP proxy server.
Here, I have some problem of designing the program because of the number of connections.
When a client attempts to make connection to the 2 servers, there would be 2 connections; one from client to the server A and the other from client to the server B. It is natural; at least I think.
However, I'm confused when there is a proxy between client and server. I thought the client might make only 1 connection to the proxy, and send all of HTTP message (to server A and server B) via the connection. The first method is very natural (making 2 connections for 2 servers), but I want to double-check this before starting implementation!

Clients might make only one connection to your proxy server (using HTTP keepalive and/or pipelining to sequentially make more than one request through the same connection), or they might make multiple connections to your proxy server (especially if they want to make more than one HTTP request in parallel). You should be prepared for both eventualities because it's up to the client what it does.
The case of two HTTP requests coming over the same connection is semantically identical to the case of the same two HTTP requests coming over separate connections.

Related

Close HTTP request socket connection

I'm implementing HTTP over TLS proxy server (sni-proxy) that make two socket connection:
Client to ProxyServer
ProxyServer to TargetServer
and transfer data between Client and TargetServer(TargetServer detected using server_name extension in ClientHello)
The problem is that the client doesn't close the connection after the response has been received and the proxy server waits for data to transfer and uses resources when the request has been done.
What is the best practice for implementing this project?
The client behavior is perfectly normal - HTTP keep alive inside the TLS connection or maybe even a Websocket connection. Given that the proxy does transparent forwarding of the encrypted traffic it is not possible to look at the HTTP traffic in order to determine exactly when the connection can be closed. A good approach is therefore to keep the connection open as long as the resources allow this and on resource shortage close the connections which were idle (no traffic) the longest time.

What is the relationship between HTTP connection and a request?

While I am configuring my nginx, I found two modules: ngx_http_limit_conn_module and ngx_http_limit_req_module
one is for limiting connection per defined key, and one for limiting request.
My question is what is the relationship (and difference) between
a HTTP connection and a request.
It seems that multiple HTTP requests can use one common HTTP connection, what is the principle under this?
Basically connections are established to make requests using it. So for instance endpoint for given key may accept 5 connections per hour from given IP address. But it doesn't mean only 5 requests can be made but much more - if the connection is not closed after a request (from HTTP 1.1 it's by default kept alive).
E.g. an endpoint accepts 5 connections and 10 requests from given IP address. If connection is established for every request only 5 requests overall can be made. If connection is kept alive single client may make all the requests. If there are 5 clients, every establishes a connection and keeps it alive there are 2 request approx. that can be made by each client - however one can make all the request if it's fast enough.
HTTP connections - client and server introduce themselves.
HTTP requests - client ask something from server.
Making a connection with server involves TCP handshaking and it is basically creating a socket connection with the server. To make a HTTP request you should be already established a connection with the server. If you established connection with a server you can make multiple request using the same connection(HTTP/1.0 by default one request per connection, HTTP/1.1 by default it is keep alive). As most of the web pages need multiple resources from the server(ex: 100 photos to load in the screen). It is a low burden to the server if we keep the connection and request those 100 images using the same connection(No need to go through the connection establishment process 100 times). That is why HTTP/1.0 came up with keep alive as default.
A request is a functional execution: "Do something for me, and return the result back to me" - which is made by the client over a channel that the server is listening on, the "connection". Think of it as making a phone call to a restaurant. When the restaurant picks up the phone, you have an established "connection" - and now can place multiple requests over the same connection. The restaurant can handle multiple, simultaneous customer calls, if it has multiple phone lines open to receive the calls. This is your "connection pool" - at any point in time, you can only have as many simultaneous open connections (max) as the size of your connection pool. The number of requests however will vary. Some client may make 3 requests, and hang up, while other client may make 10 requests before hanging up.
The size of your connection pool determines concurrency - how many simultaneous clients can you talk to at any point in time? The length of those conversations will be use case specific.

Is WebSocket 'better' than HTTP when used as a simple stateless Web Service Server?

I've read some articles comparing the differences between WebSocket and the other push methods like Long polling. All the conclusions tend to be WebSocket is better then HTTP with low latency in the server and client bidirectional communication process.
But if server push is not a must, for example, a client game program just make a few queries to the server for some information, does it still better to use WebSocket then HTTP? More specially, I have two doubts here:
1. In a single Request-Response procedure, which is more efficency ? (I establish a WebSocket connection each time querying in the above case.)
2. Will the server capacity (The total number of clients that the server can serve) be affected by the unnecessary long-lived connection if I keep an WebSocket connection during the life cycle of the client?
Added Question:
3. Suppose there is only one TCP connection between the server and the client, will the stability of the connection go down and down as time flows?
The basic thing behind both the WebSocket and HTTP is the socket. In HTTP, it opens a connection on request and closes on response. For WebSocket, concept is a 2 way communication (full duplex) rather than request-response cycle.
Answers to your question:
Either you can use HTTP server or can create request-response design
using WebSocket
That's obvious. Each connection is a socket object. Server capacity
will be affected if we are not managing connections.
In WebSocket, it's using ping-pong mechanism to make sure that the client or
the server is alive. For every ping requests from one end, other end is
subjected to reply a pong response. This mechanism helps to detect failures and hence to maintain stability.

HTTP and Sessions

I just went through the specification of http 1.1 at http://www.w3.org/Protocols/rfc2616/rfc2616.html and came across a section about connections http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8 that says
" A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field (section 14.10). Once a close has been signaled, the client MUST NOT send any more requests on that connection. "
Then I also went through a section on http state management at https://www.rfc-editor.org/rfc/rfc2965 that says in its section 2 that
"Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests;"
A section about the need to have persistent connections in the RFC 2616 also said that prior to persistent connections every time a client wished to fetch a url it had to establish a new TCP connection for each and every new request.
Now my question is, if we have persistent connections in http/1.1 then as mentioned above a client does not need to make a new connection for every new request. It can send multiple requests over the same connection. So if the server knows that every subsequent request is coming over the same connection, would it not be obvious that the request is from the same client? And hence would this just not suffice to maintain the state and would this just nit be enough for the server to understand that the request was from the same client ? In this case then why is a separate state management mechanism required at all ?
Basically, yes, it would make sense, but HTTP persistent connections are used to eliminate administrative TCP/IP overhead of connection handling (e.g. connect/disconnect/reconnect, etc.). It is not meant to say anything about the state of the data moving across the connection, which is what you're talking about.
No. For instance, there might an intermediate (such as a proxy or a reverse proxy) in the request path that aggregates requests from multiple TCP connections.
See http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-21.html#intermediaries.

HTTP client acting as a pseudo-server

Let's say I am going to deploy a server application that's likely to be placed behind a NAT/firewall and I don't want to ask users to tweak their NAT port mapping. In other words, connections to the server are impossible, but my app is a server application by nature, i.e. it sends back objects per URI.
Now, I'm thinking about initiating connections from the server periodically to see what requests are there to be responded to. I'm going to use HTTP via port 80 as something that would likely be working through NAT/firewall from virtually anywhere.
The question is, are there any standard considerations and common practices of implementing a client that can act as a server at the application level, specifically using HTTP? Any special HTTP headers? Design patterns?
E.g. I am thinking about the following scheme:
The client (which is my logical server) sends a dummy HTTP request to the server
The server responds back with non-standard headers X-Request-URI:, X-Host:, X-If-Modified-Since: etc, in other words, request headers wrapped into X-xxx as they are not standard in this situation; also requests to keep the connection alive
The client responds with a POST request that sends the requested object; again, uses wrapped headers (e.g. X-Status:, etc)
Unless there is a more "standard" way of doing something like this, do you think my approach is plausible?
Edit: an interesting discussion took place on reddit here
I've done something similar. This is very common. Client initiate the connection to the Server and keep the connection ALIVE. If the session is shut-down, client would re-initiate. When the session is up, Server can push anything to the client since it's client initiated.

Resources