Get client's real IP address on Heroku - http

On any Heroku stack, I want to get the client's IP. my first attempt might be:
request.headers['REMOTE_ADDR']
This does not work, of course, because all requests are passed through proxies. So the alternative was to use:
request.headers['X-Forwarded-For']
But this is not quite safe, is it?
If it contains only one value, I take this. If it contains more than one value (comma-separated), I could take the first one.
But what if someone manipulates this value? I cannot trust request.headers['X-Forwarded-For'] as I could with request.headers['REMOTE_ADDR']. And there is no list of trusted proxies that I could use, either.
But there must be some way to reliably get the client's IP address, always. Do you know one?
In their docs, Heroku describes that X-Forwarded-For is "the originating IP address of the client connecting to the Heroku router".
This sounds as if Heroku could be overwriting the X-Forwarded-For with the originating remote IP. This would prevent spoofing, right? Can someone verify this?

From Jacob, Heroku's Director of Security at the time:
The router doesn't overwrite X-Forwarded-For, but it does guarantee that the real origin will always be the last item in the list.
This means that, if you access a Heroku app in the normal way, you will just see your IP address in the X-Forwarded-For header:
$ curl http://httpbin.org/ip
{
"origin": "123.124.125.126",
}
If you try to spoof the IP, your alleged origin is reflected, but - critically - so is your real IP. Obviously, this is all we need, so there's a clear and secure solution for getting the client's IP address on Heroku:
$ curl -H"X-Forwarded-For: 8.8.8.8" http://httpbin.org/ip
{
"origin": "8.8.8.8, 123.124.125.126"
}
This is just the opposite of what is described on Wikipedia, by the way.
PHP implementation:
function getIpAddress() {
if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
$ipAddresses = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
return trim(end($ipAddresses));
}
else {
return $_SERVER['REMOTE_ADDR'];
}
}

I work in Heroku's support department and have spent some time discussing this with our routing engineers. I wanted to post some additional information to clarify some things about what's going on here.
The example provided in the answer above just had the client IP displayed last coincidentally and that's not really guaranteed. The reason it wasn't first is because the originating request claimed that it was forwarding for the IP specified in the X-Forwarded-For header. When the Heroku router received the request, it just appended the IP that was directly connecting to the X-Forwarded-For list after the one that had been injected into the request. Our router always adds the IP that connected to the AWS ELB in front of our platform as the last IP in the list. This IP could be the original one (and in the case where there's only one IP, it almost certainly is), but the instant there are multiple IPs chained, all bets are off. Convention is always to add the latest IP in the chain to the end of the list (which is what we do), but at any point along the chain that chain can be altered and different IPs could be inserted. As such, the only IP that's reliable (from the perspective of our platform) is the last IP in the list.
To illustrate, let's say someone initiates a request and arbitrarily adds 3 additional IPs to the X-Forwarded-For header:
curl -H "X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4" http://www.google.com
Imagine this machine's IP was 9.9.9.9 and that it had to pass through a proxy (e.g., a university's campus-wide proxy). Let's say that proxy had an IP of 2.2.2.2. Assuming it wasn't configured to strip X-Forwarded-For headers (which it likely wouldn't be), it would just tack the 9.9.9.9 IP to the end of the list and pass the request on to Google. At this point, the header would look like this:
X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9
That request will then pass through Google's endpoint, which will append the university proxy's IP of 2.2.2.2, so the header will finally look like this in Google's logs:
X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9,2.2.2.2
So, which is the client IP? It's impossible to say from Google's standpoint. In reality, the client IP is 9.9.9.9. The last IP listed is 2.2.2.2 though and the first is 12.12.12.12. All Google would know is that the 2.2.2.2 IP is definitely correct because that was the IP that actually connected to their service – but they wouldn't know if that was the initial client for the request or not from the data available. In the same way, when there's just one IP in this header – that is the IP that directly connected to our service, so we know it's reliable.
From a practical standpoint, this IP will likely be reliable most of the time (because most people won't be bothering to spoof their IP). Unfortunately, it's impossible to prevent this sort of spoofing and by the time a request gets to the Heroku router, it's impossible for us to tell if IPs in an X-Forwarded-For chain have been tampered with or not.
All reliability issues aside, these IP chains should always be read from left-to-right. The client IP should always be the left-most IP.

You can never really trust any information coming from the client. It's more of a question of who do you trust and how do you verify it. Even Heroku can possibly be influenced to provide a bad HTTP_X_FORWARDED_FOR value if they have a bug in their code, or they get hacked somehow. Another option would be some other Heroku machine connecting to your server internally and bypassing their proxy altogether while faking REMOTE_ADDR and/or HTTP_X_FORWARDED_FOR.
The best answer here would depend on what you're trying to do. If you're trying to verify your clients, a client-side certificate might be a more appropriate solution. If all you need the IP for is geo-location, trusting the input might be good enough. Worst case, someone will fake the location and get the wrong content... If you have a different use case, there are many other solutions in between those two extremes.

If I make a request with multiple X-Forwarded-For headers: curl -s -v -H "X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3" -H "X-Forwarded-For: 2.2.2.2" -H "X-Forwarded-For: 3.3.3.3" https://foo.herokuapp.com/
> X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3
> X-Forwarded-For: 2.2.2.2
> X-Forwarded-For: 3.3.3.3
The X-Forwarded-For header passed along to the app will be:
1.1.1.1, 1.1.1.2, 1.1.1.3, <real client IP>, 2.2.2.2, 3.3.3.3
so picking the last from that list does not hold up :/

Related

Faking an HTTP request header

I have a general networking question but it's related with security aspect.
Here is my case: I have a host which is infected by a malware. The malware creates an http packet to communicate with it's command and control server. While constructing the packet, the IP layer contains the correct IP address of the command and control server. The tcp layer contains the correct port number 80.
Before sending the packet out, the malware modifies the http header to replace the host header with “google.com" instead of it's server address. It then attaches the stolen data with the packet and sends it out.
My understanding is that the packet will get delivered to the correct server because the routing will happen based on the IP.
But can I host a webserver on this IP that would receive all packets with header host google.com and parse it correctly?
Based on my reading on the internet, it is possible but if it is that easy then why have malware authors not adopted this technique to spoof the http headers and bypass traditional domain whitelisting engines.
When you make a request to let's say Apache2 server, what actually Apache does is match your "Host" header with any VirtualHost within server's configuration. Only if it cannot be found / is invalid, Apache will route the request to default virtualhost if it's defined. Basically nothing stops you from changing these headers.
You can simply test it by editing your hosts file and pointing google.com to any other IP - you will be able to handle the google.com domain on your server, but only you will be to use it this way - no one else.
Anything you send inside HTTP headers shouldn't be trusted - it just a guide for your server on how to actually handle the traffic.
The fake host header is just there to trick some deep-inspection firewalls ("it's for Google? you may pass..."). The server on that IP either doesn't care about the host header (default vhost) or is explicitly configured to accept it.
Passing the loot on by using fake headers or just as plain data behind the headers is another trick to fool data loss prevention.
These methods can mislead shallow application-layer inspection but won't pass a decent firewall.

Should x-forwarded-for contain a proxy in https traffic?

I have a web server cluster behind a proxy/load balancer. That proxy contains my SSL certs and hands the web servers the decrypted traffic, and along the way adds an "x-forwarded-for" header into the HTTP header the web application receives. This application has seen millions of IP addresses over the past decade, but something weird happened today.
For the first time, I saw an x-forwarded-for that contained a second address reach the application [addressed altered]:
x-forwarded-for: 62.211.19.218, 177.168.159.85
This indicates that the traffic came through a proxy, and I understand this is normal for x-f-f. I would have thought this was impossible (or at least unlikely) with https as the protocol.
Can someone explain how this is legit?
As per RFC 7239, this HTTP header is specified as
X-Forwarded-For: client, proxy1, proxy2, ...
Where client is the IP of the original client and then each proxy adds the IP it received the request from, at the end of the list. In the above example, you would see IP of proxy3 in your webserver and proxy2 is the IP which connected to the proxy3.
As anyone can put anything inside this header, you should accept it only from known sources like your own reverse proxy or whitelist of known legit proxies. For example Apache has mod_rpaf, which transparently changes client IP address to the one provided in this header, but only if the request is received from the IP of known proxy server.
On corporate networks you can easily do transparent proxying for HTTPS traffic without any notice from normal users. Just create your own certification authority, use for example Windows Group Policy to install & trust this CA on all corporate workstations. Then redirect all HTTPS connections to your proxy which will generate certificate for all visited domains on the fly. This is something which is happening and you can even buy enterprise hardware proxies using this method.
So to summarize the reasons why you could see multiple IPs in the X-Forwarded-For header:
Transparent HTTPS proxy as mentioned above
The header was added by the requestor itself (browser, wget, script) for whatever reason, for example to hide its own IP
Some CDN like Cloudflare could add that header if used
Multiple reverse proxies defined either intentionally or by mistake
Conclusion: You should only trust this header if it originates from your own proxy (in case of multiple IPs, trust only the last one).
MAYBE it's using the Proxy protocol for HTTPS. Granted you may not be using httproxy, but this seems to be a decent description:
http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt
I'm not sure about the SSL cert, but there's no guarantee someone is doing something pathalogical (maybe unintentionally) like running all their HTTPS traffic through a proxy and then accepting all the invalid certificates. But I suspect the proxy protocol might make this work; it does expose the HTTP headers to the proxy in some sense.

Get real client IP in OpenShift?

I tried an application, and I used a way to ban those who send more than 5 empty requests to the server, but the problem then, is that everybody got blocked, and this is because everybody was seen as a ONE UNIQUE IP.
In the code, I used the way to get the X-Real-IP but it doesent work on OpenShift, so how to do that then?
Here is how I get the IP:
x_real_ip = self.request.headers.get("X-Real-IP")
remote_ip = self.request.remote_ip if not x_real_ip else x_real_ip
Update: I get '127.3.165.129', None) when doing print(self.request.remote_ip, x_real_ip)
You want to look for the "x-forwarded-for" header to get the visitors ip address. What you are seeing is the ip address of the reverse proxy that users go through before ending up at your application/gear.
You can refer to this article in the Developer Center for more information about how requests are routed on OpenShift: https://developers.openshift.com/en/managing-port-binding-routing.html

How does Proxifier resolve hostnames through proxy?

What technique is Proxifier using to resolve hostnames through proxy? All other solutions I've found on the Internet offers DNS through socks just like what Badvpn/Tun2Socks does. But Proxifier can work even through a http proxy and the only thing you need is that your proxy server support DNS (Squid for example). Their explanation is very brief saying "Proxifier has to assign placeholder (fake) IP addresses". But what that mean exactly?
Note: As you know DNS queries are UDP by default and can't be forwarded through http proxy naturally. Browsers are another examples which do name resolution through proxy when are set to use it.
Assigning fake IP in response to DNS request is different from returning truncated DNS answer.
Here's the related RFC about assigning fake IPs in response to DNS request in order to get the domain name and pass it to the remote proxy server afterward: https://www.rfc-editor.org/rfc/rfc3089
I found the answer. Redsocks has implemented this indeed, as it says:
Redsocks includes `dnstc' that is fake and really dumb DNS server that
returns "truncated answer" to every query via UDP. RFC-compliant
resolver should repeat same query via TCP in this case - so the
request can be redirected using usual redsocks facilities.

JBoss Netty getting User IP (Http Request)

I need to log the user ip ady's for every request to our JBoss Netty server. I thought:
MessageEvent e;
e.getChannel().getRemoteAddress();
was the correct answer, but this always returns 127.0.0.1 and I need the actual client ip. Coming from Rails I checked how they find out the ip, from the docu:
Determines originating IP address.
REMOTE_ADDR is the standard but will
fail if the user is behind a proxy.
HTTP_CLIENT_IP and/or
HTTP_X_FORWARDED_FOR are set by
proxies so check for these if
REMOTE_ADDR is a proxy.
HTTP_X_FORWARDED_FOR may be a comma-
delimited list in the case of multiple
chained proxies; the last address
which is not trusted is the
originating IP.
So should I check for all the headers in Netty or is there an easier way?
Ok I have the answer. Using ChannelHandlerContext instead of MessageEvent does the trick.
SocketAddress remoteAddress = ctx.getChannel().getRemoteAddress();

Resources