Should web infrastructure reset "X-Forwarded-*" headers?

Given a web infrastructure with:
WEB --> HTTP PROXY --> NGINX --> PHP
We use "X-Forwarded-{For,Host}" from PHP to get the real IP and host. But sometimes we get wrong values. I discovered that it's possible to set these fields like this:
curl -sL http://www.toto.com -H "X-Forwarded-For: 1.96.0.1"
As a developer, I don't have access to the haproxy/nginx configuration, and our infrastructure team says this is the developer's fault.
Hence my question: should the front HTTP proxy remove user-supplied "X-Forwarded-*" fields before proxying the request to the app server?

The mistake is assuming that the leftmost value in X-Forwarded-For is the client address. That is not how it is intended to work.
X-Forwarded-For works by appending the address from the incoming connection onto the right of the incoming header value, if present.
The correct solution is for the application to begin reading from the right, eliminating trusted addresses as it moves to the left, then stopping.
The rightmost untrusted address is the address you are looking for.
X-Forwarded-For: 192.168.1.5, 203.0.113.5, 10.0.0.5
Let's say 10.0.0.5 is the proxy. Nginx added this address, because the connection came from the proxy. You know this address and trust it to have always appended the address of the client it saw, so you remove it from the list. It is not the browser. You know, by virtue of the trust you have of this machine, that the next address to the left is true and correct and was added by the proxy.
After you remove 10.0.0.5, you find that 203.0.113.5 is now the rightmost address. Because you just removed a trusted address, you know that 203.0.113.5 is correct, but you do not recognize and thus cannot trust any further information to the left of this address. If 203.0.113.5 is a web proxy, belonging to an external organization, then the next address to the left, 192.168.1.5, might be the address of the browser behind that device, or it might be forged. Ultimately, it doesn't matter, because the rightmost untrusted address, 203.0.113.5 is the address you want. This is the address of the external machine that connected to your stack.
You may want to log or otherwise preserve the values further to the left, but you can't make any decisions based on anything but the rightmost remaining address, after removing any trusted address, because that is the only one that you can believe.
Importantly, you must stop further processing at the rightmost untrusted device address, because if a trusted address appears to the left of this point, then it is forged.
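The right-to-left walk described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the names TRUSTED_NETWORKS, is_trusted and rightmost_untrusted are hypothetical, and the trusted set contains only the example proxy 10.0.0.5. It also assumes the header was delivered by a machine you already trust (here, nginx).

```python
import ipaddress

# Hypothetical trusted set: only the proxy from the example above.
TRUSTED_NETWORKS = [ipaddress.ip_network("10.0.0.5/32")]

def is_trusted(addr, trusted=TRUSTED_NETWORKS):
    """Return True if addr parses as an IP inside a trusted network."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Unparseable entries are never trusted.
        return False
    return any(ip in net for net in trusted)

def rightmost_untrusted(xff_header, trusted=TRUSTED_NETWORKS):
    """Walk X-Forwarded-For from the right, skip trusted hops,
    and stop at the first untrusted entry."""
    hops = [h.strip() for h in xff_header.split(",") if h.strip()]
    for addr in reversed(hops):
        if not is_trusted(addr, trusted):
            # Everything further left may be forged, so stop here.
            return addr
    # Every entry was trusted (or the header was empty).
    return None

print(rightmost_untrusted("192.168.1.5, 203.0.113.5, 10.0.0.5"))
# prints 203.0.113.5
```

Because the walk stops at the first untrusted entry, a forged copy of a trusted address placed further to the left is never examined.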

Related

Faking an HTTP request header

I have a general networking question but it's related with security aspect.
Here is my case: I have a host which is infected by malware. The malware creates an HTTP packet to communicate with its command and control server. While constructing the packet, the IP layer contains the correct IP address of the command and control server. The TCP layer contains the correct port number 80.
Before sending the packet out, the malware modifies the HTTP headers to replace the Host header with "google.com" instead of its server address. It then attaches the stolen data to the packet and sends it out.
My understanding is that the packet will get delivered to the correct server because the routing will happen based on the IP.
But can I host a web server on this IP that would receive all packets with the Host header google.com and parse them correctly?
Based on my reading on the internet, it is possible, but if it is that easy, why have malware authors not adopted this technique to spoof HTTP headers and bypass traditional domain whitelisting engines?

When you make a request to, say, an Apache2 server, Apache matches your "Host" header against the VirtualHost entries in the server's configuration. Only if no match is found, or the header is invalid, will Apache route the request to the default virtual host, if one is defined. Basically, nothing stops you from changing these headers.
You can simply test this by editing your hosts file and pointing google.com to any other IP. You will be able to handle the google.com domain on your server, but only you will be able to use it this way; no one else will.
Anything you send inside HTTP headers shouldn't be trusted. It is just a guide for your server on how to handle the traffic.

The fake host header is just there to trick some deep-inspection firewalls ("it's for Google? you may pass..."). The server on that IP either doesn't care about the host header (default vhost) or is explicitly configured to accept it.
Passing the loot on by using fake headers or just as plain data behind the headers is another trick to fool data loss prevention.
These methods can mislead shallow application-layer inspection but won't pass a decent firewall.
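To see that the Host header is just client-supplied text, independent of the IP the packet is routed to, here is a small Python sketch using only the standard library. The names EchoHost and fetch_with_host are hypothetical; the server is a throwaway one bound to 127.0.0.1 on an ephemeral port, and it simply echoes back whatever Host value it received.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHost(BaseHTTPRequestHandler):
    """Reply with the Host header exactly as the client sent it."""

    def do_GET(self):
        body = (self.headers.get("Host") or "").encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request logging.
        pass

def fetch_with_host(fake_host):
    """Connect to a local server by IP while claiming a different Host."""
    server = HTTPServer(("127.0.0.1", 0), EchoHost)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        # Routing uses 127.0.0.1; the Host header is whatever we say it is.
        conn.request("GET", "/", headers={"Host": fake_host})
        reply = conn.getresponse().read().decode()
        conn.close()
        return reply
    finally:
        server.shutdown()
        server.server_close()

print(fetch_with_host("google.com"))
```

The connection goes to the local address, yet the server sees "google.com" as the requested host, which is the behaviour the answers above describe.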

Get real client IP in OpenShift?

I built an application that bans anyone who sends more than 5 empty requests to the server. The problem is that everybody got blocked, because all clients were seen as one single IP.
In the code, I read the X-Real-IP header, but that doesn't work on OpenShift. How should I do this instead?
Here is how I get the IP:
x_real_ip = self.request.headers.get("X-Real-IP")
remote_ip = self.request.remote_ip if not x_real_ip else x_real_ip
Update: I get ('127.3.165.129', None) when doing print(self.request.remote_ip, x_real_ip)

You want to look at the "X-Forwarded-For" header to get the visitor's IP address. What you are seeing is the IP address of the reverse proxy that users go through before ending up at your application/gear.
You can refer to this article in the Developer Center for more information about how requests are routed on OpenShift: https://developers.openshift.com/en/managing-port-binding-routing.html

Get client's real IP address on Heroku

On any Heroku stack, I want to get the client's IP. My first attempt might be:
request.headers['REMOTE_ADDR']
This does not work, of course, because all requests are passed through proxies. So the alternative was to use:
request.headers['X-Forwarded-For']
But this is not quite safe, is it?
If it contains only one value, I take this. If it contains more than one value (comma-separated), I could take the first one.
But what if someone manipulates this value? I cannot trust request.headers['X-Forwarded-For'] as I could with request.headers['REMOTE_ADDR']. And there is no list of trusted proxies that I could use, either.
But there must be some way to reliably get the client's IP address, always. Do you know one?
In their docs, Heroku describes that X-Forwarded-For is "the originating IP address of the client connecting to the Heroku router".
This sounds as if Heroku could be overwriting the X-Forwarded-For with the originating remote IP. This would prevent spoofing, right? Can someone verify this?

From Jacob, Heroku's Director of Security at the time:
The router doesn't overwrite X-Forwarded-For, but it does guarantee that the real origin will always be the last item in the list.
This means that, if you access a Heroku app in the normal way, you will just see your IP address in the X-Forwarded-For header:
$ curl http://httpbin.org/ip
{
"origin": "123.124.125.126"
}
If you try to spoof the IP, your alleged origin is reflected, but - critically - so is your real IP. Obviously, this is all we need, so there's a clear and secure solution for getting the client's IP address on Heroku:
$ curl -H"X-Forwarded-For: 8.8.8.8" http://httpbin.org/ip
{
"origin": "8.8.8.8, 123.124.125.126"
}
This is just the opposite of what is described on Wikipedia, by the way.
PHP implementation:
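function getIpAddress() {
    if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        // The router appends the connecting address, so take the last entry.
        $ipAddresses = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
        return trim(end($ipAddresses));
    }
    else {
        // No forwarding header: fall back to the direct peer address.
        return $_SERVER['REMOTE_ADDR'];
    }
}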

I work in Heroku's support department and have spent some time discussing this with our routing engineers. I wanted to post some additional information to clarify some things about what's going on here.
The example provided in the answer above just had the client IP displayed last coincidentally and that's not really guaranteed. The reason it wasn't first is because the originating request claimed that it was forwarding for the IP specified in the X-Forwarded-For header. When the Heroku router received the request, it just appended the IP that was directly connecting to the X-Forwarded-For list after the one that had been injected into the request. Our router always adds the IP that connected to the AWS ELB in front of our platform as the last IP in the list. This IP could be the original one (and in the case where there's only one IP, it almost certainly is), but the instant there are multiple IPs chained, all bets are off. Convention is always to add the latest IP in the chain to the end of the list (which is what we do), but at any point along the chain that chain can be altered and different IPs could be inserted. As such, the only IP that's reliable (from the perspective of our platform) is the last IP in the list.
To illustrate, let's say someone initiates a request and arbitrarily adds 3 additional IPs to the X-Forwarded-For header:
curl -H "X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4" http://www.google.com
Imagine this machine's IP was 9.9.9.9 and that it had to pass through a proxy (e.g., a university's campus-wide proxy). Let's say that proxy had an IP of 2.2.2.2. Assuming it wasn't configured to strip X-Forwarded-For headers (which it likely wouldn't be), it would just tack the 9.9.9.9 IP to the end of the list and pass the request on to Google. At this point, the header would look like this:
X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9
That request will then pass through Google's endpoint, which will append the university proxy's IP of 2.2.2.2, so the header will finally look like this in Google's logs:
X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9,2.2.2.2
So, which is the client IP? It's impossible to say from Google's standpoint. In reality, the client IP is 9.9.9.9. The last IP listed is 2.2.2.2 though and the first is 12.12.12.12. All Google would know is that the 2.2.2.2 IP is definitely correct because that was the IP that actually connected to their service – but they wouldn't know if that was the initial client for the request or not from the data available. In the same way, when there's just one IP in this header – that is the IP that directly connected to our service, so we know it's reliable.
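The chain in this example can be reproduced with a few lines of Python. This is only a sketch of the appending convention described above; append_hop is a hypothetical helper, and the addresses are the ones used in the example.

```python
def append_hop(xff_header, peer_ip):
    """Mimic a proxy that appends the address of whoever connected to it."""
    if xff_header:
        return f"{xff_header},{peer_ip}"
    return peer_ip

# Header as forged by the client at 9.9.9.9.
header = "12.12.12.12,15.15.15.15,4.4.4.4"
# The university proxy sees the client and appends its address.
header = append_hop(header, "9.9.9.9")
# Google's endpoint sees the proxy and appends its address.
header = append_hop(header, "2.2.2.2")

print(header)
# prints 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9,2.2.2.2
```

Only the final append was made by a machine the receiving service controls, which is why only the last entry can be relied on.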
From a practical standpoint, this IP will likely be reliable most of the time (because most people won't be bothering to spoof their IP). Unfortunately, it's impossible to prevent this sort of spoofing and by the time a request gets to the Heroku router, it's impossible for us to tell if IPs in an X-Forwarded-For chain have been tampered with or not.
All reliability issues aside, these IP chains should always be read from left-to-right. The client IP should always be the left-most IP.

You can never really trust any information coming from the client. It's more of a question of who do you trust and how do you verify it. Even Heroku can possibly be influenced to provide a bad HTTP_X_FORWARDED_FOR value if they have a bug in their code, or they get hacked somehow. Another option would be some other Heroku machine connecting to your server internally and bypassing their proxy altogether while faking REMOTE_ADDR and/or HTTP_X_FORWARDED_FOR.
The best answer here would depend on what you're trying to do. If you're trying to verify your clients, a client-side certificate might be a more appropriate solution. If all you need the IP for is geo-location, trusting the input might be good enough. Worst case, someone will fake the location and get the wrong content... If you have a different use case, there are many other solutions in between those two extremes.

If I make a request with multiple X-Forwarded-For headers:
curl -s -v -H "X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3" -H "X-Forwarded-For: 2.2.2.2" -H "X-Forwarded-For: 3.3.3.3" https://foo.herokuapp.com/
> X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3
> X-Forwarded-For: 2.2.2.2
> X-Forwarded-For: 3.3.3.3
The X-Forwarded-For header passed along to the app will be:
1.1.1.1, 1.1.1.2, 1.1.1.3, <real client IP>, 2.2.2.2, 3.3.3.3
so picking the last from that list does not hold up :/

Get domain the server was reached over?

In general, on any non-HTTP server, would there be a way to detect which domain was used to reach the IP?
I know HTTP servers get the domain passed within the request header, but would this be possible with any other server that does not require this information to be received from the client?
I'm especially looking for a way to do this with the Minecraft server (Bukkit), so my preferred language (if you need one to answer) would be Java. But I'd prefer the answers not be language-specific.

In general, no, which is why the HTTP protocol includes it in the headers.
In order to reach your server, first a DNS lookup is performed to resolve your IP, which is then followed by the connection itself. These two steps are separate, and hard to link together.
Logging what domain was last requested by a client is tricky, too, as DNS information is often cached, so the DNS request may not even reach your DNS server before being answered.
If it isn't cached, it also often isn't directly looked up by the end client, but rather by a caching DNS server operated, for instance, by the ISP.

No. The only way to get the DNS name used to connect to a server is to have the client provide it.

No. If the protocol itself provides no means for this, like the Host header in HTTP, you cannot find out which hostname the client used to resolve your IP address.

HTTP Protocol Working

I need to ask a question about the HTTP protocol. I am trying to develop a sandbox (web browser) where anyone can surf websites with different identities. A different identity means that each request to a page will come from a different IP address.
I don't know how scripts on web servers check the IP address of whoever generated the request. I am aware that this is possible, but I need to know whether it is an HTTP request header that carries the IP address or something else.
Simply speaking, I want to fool the websites. :)
Umair

Uh, the IP address is provided EVERY time you connect to ANYTHING. It has nothing to do with http headers.
See IPv4 -> packet structure -> header

You need to read up on the layers that build up a network, from the wires[1] to the application. I think you'll find that the IP address is known long before HTTP gets involved.
See http://en.wikipedia.org/wiki/OSI_model
[1] or photons, or radio waves, or smoke signals...
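A short Python sketch makes the layering concrete: the server learns the peer address from the TCP connection itself, before a single byte of HTTP (or anything else) is exchanged. The function name observed_peer is hypothetical, and the sketch assumes a loopback interface at 127.0.0.1.

```python
import socket

def observed_peer():
    """Open a TCP connection on loopback and return the address the
    server side sees, alongside the client's own local address."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    # accept() reports the peer address; no application data has been sent.
    conn, peer = listener.accept()
    try:
        return peer, client.getsockname()
    finally:
        conn.close()
        client.close()
        listener.close()

peer, local = observed_peer()
print(peer == local)  # the server sees exactly the client's socket address
```

Since the address comes from the connection rather than from the request, changing HTTP headers does not change what the server observes here.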
