Haproxy Appending Port to `HTTP_HOST` Header in Backend Request - http

I am using haproxy in front of my web-server for ssl termination.
I am forwarding request on port 81 if request is https and 80 if request is normal http-
backend b1_http
mode http
server bkend_server
backend b1_https
mode http
server bkend_server:81
Problem is, when haproxy sends request to back-end, it sends HTTP_HOST header as request.domain.com:81.
Is it possible in haproxy that I can send https request to back-end at specific port without appending the port in HTTP_HOST request header?

There are two issues, here.
First, there is no HTTP_HOST header. The header is Host:. It sounds like HTTP_HOST is something being generated internally by your web server or framework.
Second, HAProxy doesn't modify the Host: header just because your back-end is listening on a port other than 80. It doesn't actually modify the Host: header at all, unless explicit configured to, using a mechanism like reqirep ^Host: ... or http-request set-header host ....
You can confirm this with a packet capture. You should find that whatever HTTP_HOST is, the value is necessarily being generated internally on the back-end system itself, because it's not coming from HAProxy.

Related

Why is routing a request based on its Host header a good way to proxy traffic?

In the proxy documentation for Kong, it is mentioned that
Routing a request based on its Host header is the most straightforward
way to proxy traffic through Kong, especially since this is the
intended usage of the HTTP Host header
However, for this to work, any incoming request from a client must now have its Host header set to a particular value. In general, HTTP clients don't intentionally modify this value, so how is this used in practice?
In other words, clients aren't in general modifying the HTTP host header in their request, as is done in the curl examples in the docs, e.g.:
curl --url http://proxy.mydomain.com:8000/ --header 'Host: service.example.com'
Given that the proxy is intended to be transparent to clients, why is it the case that 'this is the intended usage of the HTTP Host header'?
If the proxy is transparent to the client, the client usually doesn't know that a proxy is used and therefore resolves the IP Address via DNS. The client then establishes a TCP connection to the IP address accordingly.
The (transparent) proxy now intercepts the traffic. The Host header is now the only chance to get the servers FQDN. This is important if connection is HTTPS so proxy can use the Host header value as SNI / Verify the Server's certificate.
Independent of the use of a transparent proxy, the host header should contain the server name which allows hosting multiple webpages using the same server HW.
Example:
Server IP 1.2.3.4 with 4 websites: www.a.com, www.b.com, www.c.com, www.d.com.
The client must provide the value of the website in the host header in order to allow the server to distinguish between the different websites.

How http proxy clients work

if an HTTP client reaches a website through a proxy (not reverse proxy) server, what are the actual HTTP request and its parameters that are sent from this client host to the internet?
for example:
Proxy Server: www.proxy.com:80
Target website: www.website.com:8081
Does the HTTP client send the following Get request?
Get http://www.proxy.com:80
Host: www.proxy.com:80
OR
Get http://www.website.com:8081
Host: www.website.com:8081
if the first case is true, How can the proxy know what is the actual destination to forward this request?
otherwise, if the second is true, how can the request actually reach the proxy host machine?
When you want to issue a GET request to http://www.example.com:8081/index.html, the browser connects to www.example.com:8081 and sends the following request:
GET /index.html HTTP/1.1
Host: www.example.com:8081
Now when a proxy is configured, say www.proxy.com:80, the browser will connect to www.proxy.com:80 instead, and issue the following request:
GET http://www.example.com:8081/index.html HTTP/1.1
Host: www.example.com:8081
So when a proxy is configured, the HTTP client connects to the proxy instead of to the target server, and sends the request using the absolute URI.
The client doesn't have to change the HTTP request for it to be sent to a proxy. It has to change the TCP headers.
The screenshot below shows a HTTP request sent from my browser to a proxy, as you can see nothing in the HTTP request itself specifies the proxy.
How this works is the browser/client will issue a HTTP GET request, which will then be forwarded to the TCP/IP stack and wrapped in a TCP header. The TCP header is where the destination is specified (proxy or otherwise).
Http proxy server can read http headers.
Whenever we use http proxy the destination address in the tcp packet(originating from client) has destination address of proxy server..
When the proxy server receives the tcp packet it can read the http headers(which is present in tcp packet payload) the http headers contains the actual destination for the packet.. using this information the http proxy server can forward the packet to actual destination.
Source : https://www.ibm.com/support/knowledgecenter/SSBLQQ_9.1.0/com.ibm.rational.ritpp.install.doc/topics/c_ritpp_advanced_proxy.html

Force Varnish to Use Proxy Protocol 1

I have HaProxy terminating SSL and passing the requests back to Varnish which then either serves the cached page or requests from Nginx. However, Varnish seems to be treating the request from HaProxy as HTTP/1 not HTTP/2 and failing to serve.
I can see in the Nginx logs the following when I try to hit a page:
" while reading PROXY protocol, client: 127.0.0.1, server: 127.0.0.1:8181
2016/08/11 06:53:31 [error] 5682#0: *1 broken header: "GET / HTTP/1.1
Host: www.example.com
User-Agent: curl/7.50.2-DEV
Accept: */*
X-Forwarded-For: IP_Removed
Accept-Encoding: gzip
X-Varnish: 32777
I've found something that relates to this here which states that the reason for this is that Nginx does not work with v2 PROXY only v1. So, as a result of this I've forced the use of protocol 1 in HaProxy using the send-proxy rather than send-proxy-v2 switch. But when it gets to Varnish I think that Varnish is converting this in some way to protocol 2 which is causing it to then fail to communicate properly with Nginx.
I have removed Varnish from the equation and connected HaProxy direct to Nginx and it works perfectly via HTTP/2. The problem is something is happening in the Varnish stack and the likely suspect is the proxy protocol v2 being used by Varnish.
So, to cut a long story short, how do I force Varnish to adhere to PROXY1 rather than PROXY2 protocol? I've tried adding PROXY1 into the launch daemon options but Varnish won't accept that. Any help is appreciated. Thanks!
UPDATE - I tested HaProxy > Nginx with the send-proxy-v2 switch on the HaProxy backend and it causes the identical problem to when Varnish is introduced into the stack. Switching back to send-proxy on HaProxy fixes the issue. So, I'm convinced that the issue is Varnish using protocol 2 rather than protocol 1. But how to tell it not to?
I understand that Varnish isn't HTTP/2 or does SSL but it should be passing the protocol back as is to Nginx no?
No.
But first, let's clarify. HTTP/2 and Proxy protocol V2 have absolutely nothing to do with each other. Remove HTTP/2 from your mind, as it is not applicable here in any sense.
Your question is, in fact, this:
If HAProxy is sending Proxy Protocol V1 to Varnish, and Nginx is configured behind Varnish to expect Proxy Protocol V1, why does Nginx complain of broken headers? Does Varnish not forward Proxy Protocol V1 to the backend? Does it for some reason send Proxy Protocol V2, instead?
And the answer to that question is that Varnish isn't sending either one. Neither V1 nor V2.
The only thing you need the Proxy protocol for is so that an HTTP-aware component can receive the client IP address (and port) from a upstream, non-HTTP-aware component, such as HAProxy using mode tcp or Amazon ELB with a listener in TCP mode, either of which is typically doing SSL offloading for you and not HTTP request routing, so it needs an alternative mechanism of passing the client address.
The first HTTP-aware component can take that address and set it in an HTTP header, customarily X-Forwarded-For, for the benefit of the remaining components in the stack. As such, there's no reason for Varnish to forward the Proxy protocol onward. It isn't doing that in your example, and there is no obvious reason why Varnish would even be capable of forwarding the Proxy protocol.¹
And this brings us to the error. You are misdiagnosing the problem that Nginx is reporting. The broken header error means that Nginx is receiving something other than Proxy protocol V1. With Varnish in the loop, there is no Proxy protocol header² present at all in the request to Nginx -- and when a listener is configured to expect the Proxy protocol header, that header is mandatory.
If a component is configured to expect Proxy protocol V1 and it is not present, that is always an error. But "not present" means exactly that. A V1 header is not present. That does not mean V2 is. It isn't.
So, I'm convinced that the issue is Varnish using protocol 2 rather than protocol 1.
You have convinced yourself incorrectly. Proxy V2 into Nginx -- as you have tried with HAProxy -- is an error, and no Proxy protocol header at all -- as you are seeing from Varnish -- is an error, as explained above. Both are misconfigurations, though of a different type. What you have done here is duplicated the error but for an entirely different reason.
If you are sending all requests through Varnish, then configure Varnish to set X-Forwarded-For in the forwarded request using the information it learns from the incoming Proxy protocol mesaage. Remove Proxy protocol from the Nginx configuration.
Or configure HAProxy to operate in HTTP mode and let it insert the header using option forwardfor.
¹ Clearly, from the error, Varnish is just sending ordinary HTTP headers -- nothing that looks like Proxy protocol. I don't think it even supports the option of sending Proxy protocol to the origin server, but somebody say something if I've overlooked that capability.
² I would assert that the Proxy protocol "header" is not properly called a header, given what that implies. It is a preamble, not a header, though it was unfortunately called a "header" in the standard. It's most certainly not an HTTP header.
If you upgrade Varnish to 5.0 it can send PROXY Protocol version 1 to NGINX by setting ".proxy_header = 1"

Pros and cons of using a Http proxy v/s https proxy?

The JVM allows proxy properties http.proxyHost and http.proxyPort for specifying a HTTP proxy server and https.proxyHost and https.proxyPort for specifying a HTTPS proxy server .
I was wondering whether there are any advantages of using a HTTPS proxy server compared to a HTTP proxy server ?
Is accessing a https url via a HTTPS proxy less cumbersome than accesing it from a HTTP proxy ?
HTTP proxy gets a plain-text request and [in most but not all cases] sends a different HTTP request to the remote server, then returns information to the client.
HTTPS proxy is a relayer, which receives special HTTP request (CONNECT verb) and builds an opaque tunnel to the destination server (which is not necessarily even an HTTPS server). Then the client sends SSL/TLS request to the server and they continue with SSL handshake and then with HTTPS (if requested).
As you see, these are two completely different proxy types with different behavior and different design goals. HTTPS proxy can't cache anything as it doesn't see the request sent to the server. With HTTPS proxy you have a channel to the server and the client receives and validates server's certificate (and optionally vice versa). HTTP proxy, on the other hand, sees and has control over the request it received from the client.
While HTTPS request can be sent via HTTP proxy, this is almost never done because in this scenario the proxy will validate server's certificate, but the client will be able to receive and validate only proxy's certificate, and as name in the proxy's certificate will not match the address the socket connected to, in most cases an alert will be given and SSL handshake won't succeed (I am not going into details of how to try to address this).
Finally, as HTTP proxy can look into the request, this invalidates the idea of security provided by HTTPS channel, so using HTTP proxy for HTTPS requests is normally done only for debugging purposes (again we omit cases of paranoid company security policies which require monitoring of all HtTPS traffic of company employees).
Addition: also read my answer on the similar topic here.
There are no pros or cons.
And there are no "HTTPS proxy" server.
You can tell the protocol handlers which proxy server to use for different protocols. This can be done for http, https, ftp and socks. Not more and not less.
I can't tell you if you should use a different proxy for https connections or not. It depends.
I can only explain the difference of an http and https request to a proxy.
Since the HTTP Proxy (or web proxy) understands HTTP (hence the name), the client can just send the request to the proxy server instead of the actual destenation.
This does not work for HTTPS.
This is because the proxy can't make the TLS handshake, which happens at first.
Therefore the client must send a CONNECT request to the proxy.
The proxy establishes a TCP connection and just sends the packages forth and back without touching them.
So the TLS handshake happens between the client and destenation.
The HTTP proxy server does not see everything and does not validate destenation servers certificate whatsoever.
There can be some confusion with this whole http, https, proxy thing.
It is possible to connect to a HTTP proxy with https.
In this case, the communication between the client and the proxy is encrypted.
There are also so called TLS terminating or interception proxy servers like Squid's SSL Peek and Splice or burp, which see everything.
But this should not work out of the box, because the proxy uses own certificates which are not signed by trusted CAs.
References
https://docs.oracle.com/javase/8/docs/technotes/guides/net/proxies.html
https://parsiya.net/blog/2016-07-28-thick-client-proxying---part-6-how-https-proxies-work/
http://dev.chromium.org/developers/design-documents/secure-web-proxy
https://www.rfc-editor.org/rfc/rfc2817#section-5
https://www.rfc-editor.org/rfc/rfc7231#section-4.3.6
If you mean connecting to a HTTP proxy server over TLS by saying HTTPS proxy, then
I was wondering whether there are any advantages of using a HTTPS
proxy server compared to a HTTP proxy server ?
The advantage is that your client's connection to proxy server is encrypted. E.g. A firewall can't not see which host you use CONNECT method connect to.
Is accessing a https url via a HTTPS proxy less cumbersome than
accesing it from a HTTP proxy ?
Everything is the same except that with HTTPS proxy, brower to proxy server connection is encrypted.
But you need to deploy a certificate on your proxy server, like how a https website does, and use a pac file to configure the brower to enable Connecting to a proxy over SSL.
For more details and a practical example, check my question and answer here HTTPs proxy server only works in SwitchOmega
Unfortunately, "HTTPS proxy" has two distinct meanings:
A proxy that can forward HTTPS traffic to the destination. This proxy itself is using an HTTP protocol to set up the forwarding.
In case the browser is trying to connect to a website using HTTPS, the browser will send a CONNECT request to the proxy, and the proxy will set up a TCP connection with the website and mirror all TCP traffic sent on the connection from the browser to the proxy onto the connection between the proxy and the website, and similarly mirror the response TCP packet payload from the webite to the connection with the browser. Hypothetically, the same mechanism using CONNECT could be used with HTTP traffic, but practically speaking browsers don't do that. For HTTP traffic, they send the actual HTTP request to the proxy, including the full path in the HTTP command (as well as setting the Host header): https://stackoverflow.com/a/38259076/10026
So, by this definition, HTTPS Proxy is a proxy that understands the CONNECT directive and can support HTTPS traffic going between the browser and the website.
A proxy that uses HTTPS protocol to secure client communication.
In this mode (sometimes referred to as "Secure Proxy"), the browser uses the proxy's own certificate to perform TLS handshake with the proxy, and then sends either HTTP or HTTPS traffic, (including CONNECT requests), on that connection as per (1). So, the connection between the browser and the proxy is always protected with a TLS key derived using the proxy's certificate, regardless of whether the traffic itself is encrypted with a key negotiated between the browser and the website. If HTTPS traffic is proxied via a secure proxy, it is double-encrypted on the connection between the browser and the proxy.
For example, the Proxy Switcher Chrome plugin has two separate settings to control each of these funtionalities:
As of 2022, the option to use a secure proxy is not available in MacOS and Windows manual proxy configuration UI. But a secure proxy may be specified in a PAC file used in automatic proxy configuration using the HTTPS proxy directive. It is up to the consuming application to support the HTTPS directive; most major browsers, except Safari, and many desktop apps support it.
NOTE: Things get a bit more complicated because some proxies that proxy HTTPS traffic don't simply forward TCP packet payload, as described in (1), but act as Intercepting Proxies. Using a spoofed website certificate, they effectively perform a Man-in-the-Middle attack (well, it's not necessarily an attack because it's expected behavior). Whereas the browser thinks it's using the website's certificate to set up a TLS tunnel with a website, it's actually using a spoofed certificate to set up TLS tunnel with the proxy, and the proxy sets up the TLS tunnel with the website. Then proxy has visibility into the HTTPS requests/responses. But all of that is completely orthogonal to whether the proxy is acting as a secure proxy as per (2).

Is it possible to forward NON-http connecting request to some other port in nginx?

I have nginx running on my server, listening port 80 and 433. I know nginx has a number ways of port forwarding that allows me to forward request like: http://myserver:80/subdir1 to some address like: http://myserver:8888.
My question is it possible to configure nginx so that i can forward NON-http request (just those plain TCP connection) to some other port? It's very easy to test if it's a http request because the first bytes will be either "GET" or "POST". Here's the example.
The client connected to nginx .
The client send:
a. HTTP get request: "GET / HTTP 1.1": some rule for HTTP
b. Any bytes that can't be recognized as HTTP header: forward it to some other port, say, 888, 999, etc.
Is it technically possible? Or would you suggest a way to do this?
It is possible since nginx 1.9.0:
http://nginx.org/en/docs/stream/ngx_stream_core_module.html
Something along these lines (this goes on top level of nginx.conf):
stream {
upstream backend {
server backend1.example.com:12345;
}
server {
listen 12345;
proxy_pass backend;
}
}
This is technically possible for sure.
You can modify open source tcp proxies like nginx module called nginx_tcp_proxy_module or HAproxy.
Or you can write a nginx module similar to above one to do this for you.
if nginx remote proxying with HTTP, your client could use the HTTP CONNECT command, then it connects with the remote port and forwards all data as "raw" (or at least I think so).

Resources