HTTP proxy SSL tunneling relay details - http

I am trying to wrap my head around the ssl Tunneling process which is performed by an http proxy after receiving the CONNECT method from a client.
Stuff I can't seem to find or understand in docs, blogs, rfcs:
1) when setting up the tunnel, are the two connections from client-proxy and proxy-destination two separate connections or just one and the same? E.g. is there an tcp handshake between client-proxy and another between proxy-destination?
2) when starting the ssl handshake what node is targeted (ip address/hostname) by the client? The proxy or the destination host? Since ssl requires a point-to-point connection to make the authentication work my feeling tells me it should be the destination host. But then again that wouldn't make sense since the destination host isn't (directly) accessible from the clients perspective (hence the proxy).

when setting up the tunnel, are the two connections from client-proxy and proxy-destination two separate connections or just one and the same? E.g. is there an tcp handshake between client-proxy and another between proxy-destination?
Since the client makes the TCP connection to the proxy there is no other way than that the proxy is making another TCP connection to the server. There is no way to change an existing TCP connection to be connected to a different IP:port.
when starting the ssl handshake what node is targeted (ip address/hostname) by the client? The proxy or the destination host?
The SSL handshake is done with the destination host, not the proxy.
Since ssl requires a point-to-point connection to make the authentication
It doesn't need a point-to-point connection. It just needs that all data gets exchanged unmodified between client and server which is the case when the proxy simply forwards the data.

Related

Why do we need port number in HOST header of HTTP when we have port number in TCP packet?

I'm aware of how HOST header can help us having multiple websites on a single IP address. In HOST header, we can optionally specify "port number". (80 by default for HTTP)
In OSI model, layer-4 is responsible for dealing with "ports" and after reassembling the packets, it can hand them to the correct application/process.
On the other hand, HTTP works in layer-7 of OSI. So on that point, I think the application already received the correct packet and knows the port number.
Then why the HOST header have this "port number" part and how can this "port" of HOST header help us?
Also I want to know that if they are different or can be different ?
The port in the URL is the same port that gets used for the TCP connection and it is the same port that's in the host header.
The protocol is kind of Layer 5/6 but definitely not Layer 7. You might be able to argue it is Layer 6, but probably not if it's encrypted, in which case TLS would be l5 and http l6.
Adding the port allows the session layer to instruct the OS what port to use.
For some L5 protocols the application knows the default port, eg http(80) https(443) ftp(21).
But when you want to run one of those L5 sessions over a different L4 connection the user needs a way to instruct the TCP stack to do this. So the designers of http decided to allow an optional TCP port at the end of the URL.
The port in the host header tells you which endpoint your clients connected to. Eg abc.com:80 and abc.com:81 are different Endpoints, but they could be connected to the same server instance.
While it's true that a server can work out which port a user is connected to by looking at the socket, the server implementation might not support this or it might be necessary to retain it in the future.
If your server needs the port on the host header becomes a question of implementation and needs.

HAproxy single-arm loadbalancing

I am trying to setup a loadbalancing lab for HAproxy in single-arm mode (when actual frontend IP and backend servers reside in same subnet, while actual clients are always remote). Another request is to make client source IPs visible to backend nodes. As we load-balance custom tcp-based app, it seems that option 'source 0.0.0.0 usesrc clientip' is a right choice here. Also, I have configured backends to have default-gateways pointing to HAproxy's IP address.
Although strange things happen once I enable this backend option: I see connection to frontend VIP was properly done and 3-way handshake formed. But when HAproxy server is trying to build a 2nd session to reach out to backend servers with spoofed IP of a client, I see exactly this happening:
Proxy is sending SYN with spoofed Client's IP address to one of the backends;
Backend is normally repsonds with SYN-ACK packet;
Proxy is NOT sending last ACK, just blindly sends SYN packets after timeout with same outcome;
On a proxy I see this connection is marked as SYN_SENT in netstat output, so it looks like proxy server doesn't accept actualy SYN-ACK packet for some reason.
Any comment would be appreciated.
The source option makes HAProxy bind to a specific IP address before it relays the request to the server. If you just need to load balance servers over TCP/IP (not HTTP), then you do not need this.
Set mode tcp in your frontend and backend, which enables load balancing of TCP-enabled applications.
To forward the client's IP address to the server, can you modify your custom application to support the Proxy Protocol? https://www.haproxy.com/blog/using-haproxy-with-the-proxy-protocol-to-better-secure-your-database/

Difference between HTTP(s) Reverse Proxy, TCP Proxy, Socks5 Proxy?

Here are my understandings about these and I see few gaps there; especially when and where to use
HTTP(s) proxy:
Can be used as TLS termination proxy
Can be used to modify HTTP headers
Can be used as a load balancer or a public IP provider in front of DMZ to shield backend servers
TCP Proxy
Can be used as reverse proxy for TCP connections and can support not only HTTP but also other application layer protocols such as FTP
My question(s)
If I only accept HTTP web traffic what are the use cases where we should use TCP proxy instead of HTTP Proxy
Is this understanding connect? TCP clients can connect to a single socket on TCP proxy and TCP Proxy can open up multiple connections to the backend servers something similar load balancers
SOCKS5 Proxy
From Wikipedia
Socket Secure (SOCKS) is an Internet protocol that exchanges network packets between a client and server through a proxy server. SOCKS5 additionally provides authentication so only authorized users may access a server. Practically, a SOCKS server proxies TCP connections to an arbitrary IP address, and provides a means for UDP packets to be forwarded.
SOCKS performs at Layer 5 of the OSI model (the session layer, an intermediate layer between the presentation layer and the transport layer). SOCKS server accepts incoming client connection on TCP port 1080
My questions
What is the use of SOCKS proxy in an web application
Difference between TCP and SOCKS5 proxy
In TCP/IP model is it a transport layer protocol
What are the use cases for proxying UDP connections
If I only accept HTTP web traffic what are the use cases where we should use TCP proxy instead of HTTP Proxy
A TCP proxy terminates the incoming TCP socket, opens outbound socket and moves data in between. It doesn't/can't change the data in between since it doesn't understand any of it. Most often, a TCP proxy is statically configured and can only create connections to a single host:port combination.
An HTTP proxy understands HTTP. It looks at the incoming HTTP request and uses an outbound, potentially changed HTTP request to fulfill the request. The proxy can read the HTTP request's host address and connect to multiple hosts that way. It is aware of the HTTP application level which a TCP proxy isn't. Some HTTP proxies can even fulfill FTP or HTTPS requests for clients just using HTTP.
A "forward" proxy is a proxy connecting from private to public IP space (which was the original idea for a proxy) while a "reverse" proxy connects from public to private IP (e.g. mapping to multiple web servers from a single, public IP). Technically, it's the same, but from the security POV there's a huge difference (in "forward" you trust the clients, in "reverse" you trust the servers).
Is this understanding connect? TCP clients can connect to a single socket on TCP proxy and TCP Proxy can open up multiple connections to the backend servers something similar load balancers
Yes.
Difference between TCP and SOCKS5 proxy
SOCKS5 is a general proxy protocol that can do more than a TCP proxy, including one-to-many connections, listening ports, and UDP.
In TCP/IP model is it a transport layer protocol
To me, SOCKS5 is an application layer protocol to arbitrate a transport protocol connection. Some argue that SOCKS5 is a session layer protocol in between transport and application layer - that holds some truth but the session layer is ill-defined in TCP/IP.
What are the use cases for proxying UDP connections
For instance, SOCKS5 can be used for private-to-public Internet access or for (insecure) public-to-private LAN access.

How do browsers detect which HTTP response is theirs?

Given that you have multiple web browsers running, all which obviously listen on port 80, how would a browser figure if an incoming HTTP response was originated by itself? And whether or not catch the response and show it?
As part of the connection process a TCP/IP connection is assigned a client port. Browsers do not "listen on port 80"; rather a browser/clients initiate a request to port 80 on the server and waits for a reply on the client port from the server's IP.
After the client port is assigned (locally), each client [TCP/IP] connection is uniquely identified by (server IP, server port, client IP, client port) and the connection (and response sent over such) can be "connected back" to the correct browser. This same connection-identifying tuple is how a server doesn't confuse multiple requests coming from the same client/IP1
HTTP sits on top of the TCP/IP layer and doesn't have to concern itself with mixing up connection streams. (HTTP/2 introduces multiplexing, but that is a different beast and only affects connection from the same browser.)
See The Ephemeral Port Range for an overview:
A TCP/IPv4 connection consists of two endpoints, and each endpoint consists of an IP address and a port number. Therefore, when a client user connects to a server computer, an established connection can be thought of as the 4-tuple of (server IP, server port, client IP, client port). Usually three of the four are readily known -- client machine uses its own IP address and when connecting to a remote service, the server machine's IP address and service port number are required [leaving only the client port unknown and to be automatically assigned].
What is not immediately evident is that when a connection is established that the client side of the connection uses a port number. Unless a client program explicitly requests a specific port number, the port number used is an ephemeral port number. Ephemeral ports are temporary ports assigned by a machine's IP stack, and are assigned from a designated range of ports for this purpose. When the connection terminates, the ephemeral port is available for reuse, although most IP stacks won't reuse that port number until the entire pool of ephemeral ports have been used. So, if the client program reconnects, it will be assigned a different ephemeral port number for its side of the new connection.
See TCP/IP Client (Ephemeral) Ports and Client/Server Application Port Use for an additional gentle explanation:
To know where to send the reply, the server must know the port number the client is using. This [client port] is supplied by the client as the Source Port in the request, and then used by the server as the destination port to send the reply. Client processes don't use well-known or registered ports. Instead, each client process is assigned a temporary port number for its use. This is commonly called an ephemeral port number.
1 If there are multiple client computers (ie. different TCP/IP stacks each assigning possibly-duplicate ephemeral ports) using the same external IP then something like Network Address Translation must be used so the server still has a unique tuple per connection:
Network address translation (NAT) is a methodology of modifying network address information in Internet Protocol (IP) datagram packet headers while they are in transit across a traffic routing device for the purpose of remapping one IP address space into another.
thank you all for answers.
the hole listening thing over port 80 was my bad,I must have been dizzy last night :D
anyway,as I have read HTTP is connectionless.
browser initiates an HTTP request and after a request is made, the client disconnects from >the server and waits for a response. The server process the request and re-establish the >connection with the client to send response back.
therefor the browser does not maintain connection waiting for a response.so the answer is not that easy to just send the response back to the open socket.
here's the source
Pay attention browesers aren't listening on specific port to receive HTTP response. Web server listening on specific ports (usually 80 or 443). Browser open connection to web server, and send HTTP request to web server. Browser don't close connection before receive HTTP response. Web server writes HTTP response on opened connection.
Given that you have multiple web browsers running, all which obviously listen on port 80
Not obvious: just wrong. The HTTP server listens on port 80. The browsers connect to port 80.
how would a browser figure if an incoming HTTP response was originated by itself?
Because it comes back on the same connection and socket that was used to send the request.
And whether or not catch the response and show it?
Anything that comes back on the connected socket belongs to the guy who connected the socket.
And in any case all this is the function of TCP, not the browser.

Http 1.1 connection and client port

Does the client remote port changes during an HTTP 1.1 connection exchange?
I am trying to figure out if I can programmaticaly uniquely identify a connection on the server using the request remote port and remote ip address.
This is not as much an HTTP question, as it's a TCP one. And no, the port doesn't change: the ephemeral port stays the same for the duration of the connection.
However, as soon as a new connection is made, the client can (and most probably will) use a different port. This totally depends on the implementation of the client OS and the Network Address Translation of intermediary routers.
Anyway, it is not something you can depend on to build something like a session, because the next request from the same client may very well arrive from a different port (let alone that HTTP does not have to run on top of TCP).
Just use a session-ID which you store in a cookie.

Resources