I tried an application, and I used a way to ban those who send more than 5 empty requests to the server, but the problem then, is that everybody got blocked, and this is because everybody was seen as a ONE UNIQUE IP.
In the code, I used the way to get the X-Real-IP but it doesent work on OpenShift, so how to do that then?
Here is how I get the IP:
x_real_ip = self.request.headers.get("X-Real-IP")
remote_ip = self.request.remote_ip if not x_real_ip else x_real_ip
Update: I get '127.3.165.129', None) when doing print(self.request.remote_ip, x_real_ip)
You want to look for the "x-forwarded-for" header to get the visitors ip address. What you are seeing is the ip address of the reverse proxy that users go through before ending up at your application/gear.
You can refer to this article in the Developer Center for more information about how requests are routed on OpenShift: https://developers.openshift.com/en/managing-port-binding-routing.html
Related
Let's say I have this DNS entry: mysite.sample. I am developing, and have a copy of my website running locally in http://localhost:8080. I want this website to be reachable using the (fake) DNS: http://mysite.sample, without being forced to remember in what port this site is running. I can setup /etc/hosts and nginx to do proxing for that, but ... Is there an easier way?
Can I somehow setup a simple DNS entry using /etc/hosts and/or dnsmasq where also a non-standard port (something different than :80/:443) is specified? Without the need to provide extra configuration for nginx?
Or phrased in a simpler way: Is it possible to provide port mappings for dns entries in /etc/hosts or dnsmasq?
DNS has nothing to do with the TCP port. DNS is there to resolv names (e.g. mysite.sample) into IP addresses - kind of like a phone book.
So it's a clear "NO". However, there's another solution and I try to explain it.
When you enter http://mysite.sample:8080 in your browser URL bar, your client (e.g. browser) will first try to resolve mysite.sample (via OS calls) to an IP address. This is where DNS kicks in, as DNS is your name resolver. If that happened, the job of DNS is finished and the browser continues.
This is where the "magic" in HTTP happens. The browser is connecting to the resolved IP address and the desired port (by default 80 for http and 443 for https), is waiting for the connection to be accepted and is then sending the following headers:
GET <resource> HTTP/1.1
Host: mysite.sample:8080
Now the server reads those headers and acts accordingly. Most modern web servers have something called "virtual hosts" (i.e. Apache) or "sites" (i.e. nginx). You can configure multiple vhosts/sites - one for each domain. The web server will then provide the site matching the requested host (which is retreived by the browser from the URL bar and passed to the server via Host HTTP header). This is pure HTTP and has nothing to do with TCP.
If you can't change the port of your origin service (in your case 8080), you might want to setup a new web server in front of your service. This is also called reverse proxy. I recommend reading the NGINX Reverse Proxy docs, but you can also use Apache or any other modern web server.
For nginx, just setup a new site and redirect it to your service:
location mysite.example {
proxy_pass http://127.0.0.1:8080;
}
There is a mechanism in DNS for discovering the ports that a service uses, it is called the Service Record (SRV) which has the form
_service._proto.name. TTL class SRV priority weight port target.
However, to make use of this record you would need to have an application that referenced that record prior to making the call. As Dominique has said, this is not the way HTTP works.
I have written a previous answer that explains some of the background to this, and why HTTP isn't in the standard. (the article discusses WS, but the underlying discussion suggested adding this to the HTTP protocol directly)
Edited to add -
There was actually a draft IETF document exploring an official way to do this, but it never made it past draft stage.
This document specifies a new URI scheme called http+srv which uses a DNS SRV lookup to locate a HTTP server.
There is an specific SO answer here which points to an interesting post here
Given a web infrastructure, with:
WEB --> HTTP PROXY --> NGINX --> PHP
We use "X-forwarded-{for,host} from Php to get the real IP and host. But sometime, we got wrong values, I discover that it's possible to set these fields like this:
curl -sL http://www.toto.com -H "X-Forwarded-For: 1.96.0.1"
As a developer, I dont have access to haproxy/nginx configuration, and our infra. guys are saying that is the developer fault.
Hence my question, the front HTTP proxy should remove user "X-forwarded" fields before proxing query to app server?
The mistake is assuming that the leftmost value in X-Forwarded-For is the client address. That is not how it is intended to work.
X-Forwarded-For works by appending the address from the incoming connection onto the right of the incoming header value, if present.
The correct solution is for the application to begin reading from the right, eliminating trusted addresses as it moves to the left, then stopping.
The rightmost untrusted address is the address you are looking for.
X-Forwarded-For: 192.168.1.5, 203.0.113.5, 10.0.0.5
Let's say 10.0.0.5 is the proxy. Nginx added this address, because the connection came from the proxy. You know this address and trust it to have always appended the address of the client it saw, so you remove it from the list. It is not the browser. You know, by virtue of the trust you have of this machine, that the next address to the left is true and correct and was added by the proxy.
After you remove 10.0.0.5, you find that 203.0.113.5 is now the rightmost address. Because you just removed a trusted address, you know that 203.0.113.5 is correct, but you do not recognize and thus cannot trust any further information to the left of this address. If 203.0.113.5 is a web proxy, belonging to an external organization, then the next address to the left, 192.168.1.5, might be the address of the browser behind that device, or it might be forged. Ultimately, it doesn't matter, because the rightmost untrusted address, 203.0.113.5 is the address you want. This is the address of the external machine that connected to your stack.
You may want to log or otherwise preserve the values further to the left, but you can't make any decisions based on anything but the rightmost remaining address, after removing any trusted address, because that is the only one that you can believe.
Importantly, you must stop further processing at the rightmost untrusted device address, because if a trusted address appears to the left of this point, then it is forged.
On any Heroku stack, I want to get the client's IP. my first attempt might be:
request.headers['REMOTE_ADDR']
This does not work, of course, because all requests are passed through proxies. So the alternative was to use:
request.headers['X-Forwarded-For']
But this is not quite safe, is it?
If it contains only one value, I take this. If it contains more than one value (comma-separated), I could take the first one.
But what if someone manipulates this value? I cannot trust request.headers['X-Forwarded-For'] as I could with request.headers['REMOTE_ADDR']. And there is no list of trusted proxies that I could use, either.
But there must be some way to reliably get the client's IP address, always. Do you know one?
In their docs, Heroku describes that X-Forwarded-For is "the originating IP address of the client connecting to the Heroku router".
This sounds as if Heroku could be overwriting the X-Forwarded-For with the originating remote IP. This would prevent spoofing, right? Can someone verify this?
From Jacob, Heroku's Director of Security at the time:
The router doesn't overwrite X-Forwarded-For, but it does guarantee that the real origin will always be the last item in the list.
This means that, if you access a Heroku app in the normal way, you will just see your IP address in the X-Forwarded-For header:
$ curl http://httpbin.org/ip
{
"origin": "123.124.125.126",
}
If you try to spoof the IP, your alleged origin is reflected, but - critically - so is your real IP. Obviously, this is all we need, so there's a clear and secure solution for getting the client's IP address on Heroku:
$ curl -H"X-Forwarded-For: 8.8.8.8" http://httpbin.org/ip
{
"origin": "8.8.8.8, 123.124.125.126"
}
This is just the opposite of what is described on Wikipedia, by the way.
PHP implementation:
function getIpAddress() {
if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
$ipAddresses = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
return trim(end($ipAddresses));
}
else {
return $_SERVER['REMOTE_ADDR'];
}
}
I work in Heroku's support department and have spent some time discussing this with our routing engineers. I wanted to post some additional information to clarify some things about what's going on here.
The example provided in the answer above just had the client IP displayed last coincidentally and that's not really guaranteed. The reason it wasn't first is because the originating request claimed that it was forwarding for the IP specified in the X-Forwarded-For header. When the Heroku router received the request, it just appended the IP that was directly connecting to the X-Forwarded-For list after the one that had been injected into the request. Our router always adds the IP that connected to the AWS ELB in front of our platform as the last IP in the list. This IP could be the original one (and in the case where there's only one IP, it almost certainly is), but the instant there are multiple IPs chained, all bets are off. Convention is always to add the latest IP in the chain to the end of the list (which is what we do), but at any point along the chain that chain can be altered and different IPs could be inserted. As such, the only IP that's reliable (from the perspective of our platform) is the last IP in the list.
To illustrate, let's say someone initiates a request and arbitrarily adds 3 additional IPs to the X-Forwarded-For header:
curl -H "X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4" http://www.google.com
Imagine this machine's IP was 9.9.9.9 and that it had to pass through a proxy (e.g., a university's campus-wide proxy). Let's say that proxy had an IP of 2.2.2.2. Assuming it wasn't configured to strip X-Forwarded-For headers (which it likely wouldn't be), it would just tack the 9.9.9.9 IP to the end of the list and pass the request on to Google. At this point, the header would look like this:
X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9
That request will then pass through Google's endpoint, which will append the university proxy's IP of 2.2.2.2, so the header will finally look like this in Google's logs:
X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9,2.2.2.2
So, which is the client IP? It's impossible to say from Google's standpoint. In reality, the client IP is 9.9.9.9. The last IP listed is 2.2.2.2 though and the first is 12.12.12.12. All Google would know is that the 2.2.2.2 IP is definitely correct because that was the IP that actually connected to their service – but they wouldn't know if that was the initial client for the request or not from the data available. In the same way, when there's just one IP in this header – that is the IP that directly connected to our service, so we know it's reliable.
From a practical standpoint, this IP will likely be reliable most of the time (because most people won't be bothering to spoof their IP). Unfortunately, it's impossible to prevent this sort of spoofing and by the time a request gets to the Heroku router, it's impossible for us to tell if IPs in an X-Forwarded-For chain have been tampered with or not.
All reliability issues aside, these IP chains should always be read from left-to-right. The client IP should always be the left-most IP.
You can never really trust any information coming from the client. It's more of a question of who do you trust and how do you verify it. Even Heroku can possibly be influenced to provide a bad HTTP_X_FORWARDED_FOR value if they have a bug in their code, or they get hacked somehow. Another option would be some other Heroku machine connecting to your server internally and bypassing their proxy altogether while faking REMOTE_ADDR and/or HTTP_X_FORWARDED_FOR.
The best answer here would depend on what you're trying to do. If you're trying to verify your clients, a client-side certificate might be a more appropriate solution. If all you need the IP for is geo-location, trusting the input might be good enough. Worst case, someone will fake the location and get the wrong content... If you have a different use case, there are many other solutions in between those two extremes.
If I make a request with multiple X-Forwarded-For headers: curl -s -v -H "X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3" -H "X-Forwarded-For: 2.2.2.2" -H "X-Forwarded-For: 3.3.3.3" https://foo.herokuapp.com/
> X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3
> X-Forwarded-For: 2.2.2.2
> X-Forwarded-For: 3.3.3.3
The X-Forwarded-For header passed along to the app will be:
1.1.1.1, 1.1.1.2, 1.1.1.3, <real client IP>, 2.2.2.2, 3.3.3.3
so picking the last from that list does not hold up :/
In general on any non-HTTP server. Would there be a way to detect what domain was used to reach the IP?
I know HTTP servers get the domain passed within the request header, but would this be possible with any other server that does not require this information to be received from the client?
I'm especially looking for a way to do this with the minecraft server (Bukkit) so my preferred language (if needed for you to answer) would be Java. But I'd like to not have the theories about this language specific.
In general, no, which is why the HTTP protocol includes it in the headers.
In order to reach your server, first a DNS lookup is performed to resolve your IP, which is then followed by the connection itself. These two steps are separate, and hard to link together.
Logging what domain was last requested by a client is tricky, too, as DNS information is often cached, so the DNS request may not even reach your DNS server before being answered.
If it isn't cached, it also often isn't directly looked up by the end client, but rather by a caching DNS server operated, for instance, by the ISP.
No. The only way to get the DNS name used to connect to a server is to have the client provide it.
No, if there are no means for this in the protocol itself like the Host header in HTTP you cannot find out which hostname was used on the client to resolve your IP address.
I need to ask a question about HTTP protocol. I am trying to develop a sandbox (web browser) where any one can surf the website with different identities. Different identity means that on each request to a page will be from different IP address.
Now I don't know how scripts on web servers check the IP address of the one who generated the request. This is possible and I am aware of this. But I need to know whether this is HTTP request header that has the IP address or something else.
Simply speaking, I want to fool the websites. :)
Umair
Uh, the IP address is provided EVERY time you connect to ANYTHING. It has nothing to do with http headers.
See IPv4 -> packet structure -> header
You need to read up on the layers that build up a network from the wires[1] to the application. I think you'll find the the IP address is known long before HTTP gets involved.
See http://en.wikipedia.org/wiki/OSI_model
[1] or photons, or radio waves, or smoke signals...