I would like to know how internal communication links between private internal servers and a reverse proxy look.
When from my client (browser) I make a request to, say https://facebook.com, I hit Facebook's reverse proxy. I have two questions, when that reverse proxy gets a request and needs to forward it to the server that should handle it, does that sever it is forwarding the request to have a domain name or is it just an IP address ((user.facebook.com or useroffacebook.com v.s. 34.23.66.25 (DO NOT GO TO THAT ADDRESS I JUST MADE IT UP!!!)))? Also, does that connection use HTTP or HTTPS?
Like Kshitij Joshi already mentioned, it could be both.
A more detailed perspective for implementation:
reverse proxy should use IP addresses for routing so they are still working even if the DNS fails or is unavailable to the proxy for some reason.
internal traffic should also be encrypted (HTTPS). using plain text, even in internal networks, must be considered dangerous and is not recommended.
from my mindset you can replace the 'should' with a 'must'.
Related
I have a general networking question but it's related with security aspect.
Here is my case: I have a host which is infected by a malware. The malware creates an http packet to communicate with it's command and control server. While constructing the packet, the IP layer contains the correct IP address of the command and control server. The tcp layer contains the correct port number 80.
Before sending the packet out, the malware modifies the http header to replace the host header with “google.com" instead of it's server address. It then attaches the stolen data with the packet and sends it out.
My understanding is that the packet will get delivered to the correct server because the routing will happen based on the IP.
But can I host a webserver on this IP that would receive all packets with header host google.com and parse it correctly?
Based on my reading on the internet, it is possible but if it is that easy then why have malware authors not adopted this technique to spoof the http headers and bypass traditional domain whitelisting engines.
When you make a request to let's say Apache2 server, what actually Apache does is match your "Host" header with any VirtualHost within server's configuration. Only if it cannot be found / is invalid, Apache will route the request to default virtualhost if it's defined. Basically nothing stops you from changing these headers.
You can simply test it by editing your hosts file and pointing google.com to any other IP - you will be able to handle the google.com domain on your server, but only you will be to use it this way - no one else.
Anything you send inside HTTP headers shouldn't be trusted - it just a guide for your server on how to actually handle the traffic.
The fake host header is just there to trick some deep-inspection firewalls ("it's for Google? you may pass..."). The server on that IP either doesn't care about the host header (default vhost) or is explicitly configured to accept it.
Passing the loot on by using fake headers or just as plain data behind the headers is another trick to fool data loss prevention.
These methods can mislead shallow application-layer inspection but won't pass a decent firewall.
The question is about HTTP vs HTTPS.
If I want to anonymously load a website that forces HTTPS, like Google.com, do I need an HTTPS proxies, or can I get away with HTTP proxies?
If your proxy is SOCKS it will not care what kind of socket is connecting through it. It has its own handshake and it does not care about what happens after the handshake. Whether after the SOCKS handshake an SSL handshake (HTTPS) is started it is not a SOCKS proxy problem, it will just pass through.
Several HTTP proxies on the other hand expect HTTP headers to guide them, such a HTTP proxy will not allow HTTPS since it needs to read the headers.
On the third hand (ekhm... well, foot?), an HTTP proxy that supports HTTP CONNECT can also setup the transfer of arbitrary data. Therefore such a proxy can setup any type of socket, which can have an SSL handshake, which can then be used for HTTPS transfer.
HTTP Proxy Server supports CONNECT verb which supports HTTPS connections within HTTP Proxy. You don't need special HTTPS proxy server or any other setup.
CONNECT verb allows you to create binary socket tunnel to any given IP:Port address. So any HTTP client (all browsers), will open secure tunnel and communicate securely over proxy server. However, no one cant control or see anything that is going through the tunnel unless they implement man in middle attack by sending you self-signed certificates.
Most firewall these days automatically implement man in middle self signed certificates that are deployed in work network, so you have to probably dig more to identify whether it is really secure or not. So it may not be that anonymous.
If you're trying to access a service anonymously, you won't get this by running your own proxy. It's not clear from the original question what is meant by "proxy", e.g. local service, or remote service. You won't get anonymity by surfing through a proxy that's on your network, unless it's something like a TOR proxy which relays out through the TOR network.
As for whether proxies can support HTTPS or not, that's been covered here, it would be unusual to find a proxy that doesn't support CONNECT. However if it's a remote anonymizing service you're using, I doubt they would do MitM, since you'd need to install the signing cert into your trusted root store, so they couldn't do that surreptitiously.
I have a web server cluster behind a proxy/load balancer. That proxy contains my SSL certs and hands the web servers the decrypted traffic, and along the way adds an "x-forwarded-for" header into the HTTP header the web application receives. This application has seen millions of IP addresses over the past decade, but something weird happened today.
For the first time, I saw an x-forwarded-for that contained a second address reach the application [addressed altered]:
x-forwarded-for: 62.211.19.218, 177.168.159.85
This indicates that the traffic came through a proxy, and I understand this is normal for x-f-f. I would have thought this was impossible (or at least unlikely) with https as the protocol.
Can someone explain how this is legit?
As per RFC 7239, this HTTP header is specified as
X-Forwarded-For: client, proxy1, proxy2, ...
Where client is the IP of the original client and then each proxy adds the IP it received the request from, at the end of the list. In the above example, you would see IP of proxy3 in your webserver and proxy2 is the IP which connected to the proxy3.
As anyone can put anything inside this header, you should accept it only from known sources like your own reverse proxy or whitelist of known legit proxies. For example Apache has mod_rpaf, which transparently changes client IP address to the one provided in this header, but only if the request is received from the IP of known proxy server.
On corporate networks you can easily do transparent proxying for HTTPS traffic without any notice from normal users. Just create your own certification authority, use for example Windows Group Policy to install & trust this CA on all corporate workstations. Then redirect all HTTPS connections to your proxy which will generate certificate for all visited domains on the fly. This is something which is happening and you can even buy enterprise hardware proxies using this method.
So to summarize the reasons why you could see multiple IPs in the X-Forwarded-For header:
Transparent HTTPS proxy as mentioned above
The header was added by the requestor itself (browser, wget, script) for whatever reason, for example to hide its own IP
Some CDN like Cloudflare could add that header if used
Multiple reverse proxies defined either intentionally or by mistake
Conclusion: You should only trust this header if it originates from your own proxy (in case of multiple IPs, trust only the last one).
MAYBE it's using the Proxy protocol for HTTPS. Granted you may not be using httproxy, but this seems to be a decent description:
http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt
I'm not sure about the SSL cert, but there's no guarantee someone is doing something pathalogical (maybe unintentionally) like running all their HTTPS traffic through a proxy and then accepting all the invalid certificates. But I suspect the proxy protocol might make this work; it does expose the HTTP headers to the proxy in some sense.
I would like to know how "realistic" is to consider implementing an intercepting proxy(with cache support) for the purpose of web filtering. I would like to support also IPv6, authentication of clients and caching.
Reading to the list of disadvantages from squid wiki http://wiki.squid-cache.org/SquidFaq/InterceptionProxy that implements an intercepting proxy, it mentions some things to consider as disadvantages when using it(that I want to clarify):
Requires IPv4 with NAT - proxy intercepting does not support IPv6, why ?
it causes path-MTU (PMTUD) to possibly fail - why ?
Proxy authentication does not work - client thinks it's talking directly to the originating server, in there a way to do authentication in this case ?
Interception Caching only supports the HTTP protocol, not gopher, SSL, or FTP. You cannot setup a redirection-rule to the proxy server for other protocols other than HTTP since it will not know how to deal with it - This seems quite plausible as the way redirecting of traffic to proxy is done in this case is by a firewall changing the destination address of a packet from the originating server to the proxy's own address(Destination NAT). How would in this case, if i want to intercept other protocols besides http know where the connection was intended to go so I can relay it to that destination ?
Traffic may be intercepted in many ways. It does not necessarily need to use NAT (which is not supported in IPv6). A transparent interception will surely not use NAT for example (transparent in the sense that the Proxy will not generate requests with his own address but with the client address, spoofing the IP address).
PMTUD is used to detect the largest MTU size available in the path between the client and server and vise versa, it is useful for avoiding fragmentation of Ip packets on the path between the client and server. When you use a Proxy in the middle, even if the MTU is detected, it not necessarily the same as the one from the client to the proxy and from the proxy to the server. But this is not always relevant, it depends on what traffic is being served and how the proxy is behaving.
If the proxy is authenticating in the client behalf, it needs to be aware of the authentication method, and it will probably need some cookies that exist in the client. Think of it this way... If a proxy can authenticate an access to a restricted resource on your behalf, it means anyone can do it on your behalf, and the purpose of a good authentication is to protect you from such possibilities.
I guess this was a very old post from the Squid guys, but the technology exists to redirect anything you want to a specific server. One simple way to do it is by placing your server as a Default Gateway for the network, then all packets pass through it and you could redirect the packets you like to your application (or another server). And you are not limited to HTTP, BUT you are limited to the way the application protocol works.
In general on any non-HTTP server. Would there be a way to detect what domain was used to reach the IP?
I know HTTP servers get the domain passed within the request header, but would this be possible with any other server that does not require this information to be received from the client?
I'm especially looking for a way to do this with the minecraft server (Bukkit) so my preferred language (if needed for you to answer) would be Java. But I'd like to not have the theories about this language specific.
In general, no, which is why the HTTP protocol includes it in the headers.
In order to reach your server, first a DNS lookup is performed to resolve your IP, which is then followed by the connection itself. These two steps are separate, and hard to link together.
Logging what domain was last requested by a client is tricky, too, as DNS information is often cached, so the DNS request may not even reach your DNS server before being answered.
If it isn't cached, it also often isn't directly looked up by the end client, but rather by a caching DNS server operated, for instance, by the ISP.
No. The only way to get the DNS name used to connect to a server is to have the client provide it.
No, if there are no means for this in the protocol itself like the Host header in HTTP you cannot find out which hostname was used on the client to resolve your IP address.