Does a HTTP response indicate the address of the receiver or something similar? - http

A HTTP request indicates the resource on the server by the URL in the start line and the HOST header.
Does a HTTP response indicate the address of the receiver or something similar?
If not, why is it not necessary?
Thanks.

Internet protocols are layered.
HTTP requests are wrapped in TCP packages, which are wrapped in IP.
The outside IP packet contains information about who is the receipient and who is the sender of a message. Based on this information a TCP/IP service knows where to send the message back to.
The Host header was actually a later addition to HTTP. It wasn't really needed before because it was safer to assume that a single ip address would have a single HTTP service. The Host header was added because people needed many different domains to be served from a smaller set of ip addresses and send different responses based on what the domain was.
Without the Host header it wouldn't have been possible to know which domain the user wanted because the ip packet only encodes the ip address, not which domain was used to find the ip.

Related

How does a server know which domain name was used?

As far as i know what we get from a dns query is a ip address. So in the end of the day if thats true we are still using ip addresses to connect the server and domains are pretty names for them.
So how does a server know which domain i used to query that ip address?
How does vhosts work an understand that if the domain data is lost during dns query?
The Internet works in layers. Each layer uses different kind of parameters to do its work.
Layer 3 is typically IP aka Internet Protocol. To work it uses IP addresses, each computer has at least one to be able to discuss with another one. And there are two families in fact: version 4 and version 6.
Since multiple services can be on any given computer at some point, you need a layer on top of that, layer 4, that deals with transport. The "predominant" one is TCP aka Transport Control Protocol, but there is also UDP. TCP and UDP uses ports: a 2 bytes integer that encodes for a specific protocol.
For example, HTTP was given port number 80 (completely arbitrary), and HTTPS port 443.
The DNS, which itself uses UDP and TCP (on port 53), allows, among other things, to map a given hostname to a given IP address or multiple IP addresses. This is the typical A and AAAA records. There is also a CNAME record that maps one domain name to another. There also exists a SRV record that maps a service (which is a protocol name + a transport) to a given hostname and port number.
When one computer connects to another, its first step for all the above is to find out which IP address to use to connect to. It can use the DNS for that. Typically it will get only the IP address, but, depending on the protocol (layer above 4), may also get a port (if using SRV records).
The HTTP world does not use SRV records. So a browser just uses the hardcoded 80 or 443 ports, or the port number appearing in the URL.
Then we are at the transport level, let us say TCP.
The connection is done (since now the remote IP address and port are known) and the protocol above TCP, like HTTP, is free to convey any kind of extra data, such as the hostname that the client initially used (as taken from the URL) to find out the IP address.
This is done through the HTTP host header, see RFC 2616
Note that if you do things through TLS (which conceptually sits between TCP and HTTP) there is even something else happening: SNI or Server Name Indication.
When doing the TLS handshake, so before any kind of HTTP headers or content, the client will send the final hostname desired in some specific TLS message. Why? So that the server can find which specific certificate it should answer which as otherwhise it would not be able to know which hostname is requested as this sits in some HTTP header which do not exist until the TLS handshake is finished.
A webserver will be able to see both the SNI content to find out which certificate to send back and then the host header to find out which VirtualHost (in Apache) section is relevant to the query being processed.
If you are not in HTTP world, then it all depends on the protocol used. Older protocols, like FTP, did not plan for "multihoming" at the beginning, a given IP address meant only one hostname and service for example.

Ensure http response goes to $_SERVER['REMOTE_ADDR']

I want to ensure people who use my site are who they say they are. Not who they say, they say, they are.
How do I ensure my data is going back to the IP given by $_SERVER['REMOTE_ADDR']?
Or is it automatic that the http response is sent there?
How do I ensure my data is going back to the IP given by
$_SERVER['REMOTE_ADDR']?
The client IP is handled by the TCP protocol. The REMOTE_ADDR property is populated by the client address of the TCP connection. It's not part of the HTTP protocol. So it is guaranteed that your application is talking to this IP.
This doesn't mean that the IP that you are seeing is the actual IP of the end-user (as attributed for example by his internet provider). There could be proxies or intermediate devices between him and your application. So basically what you will be seeing is the closest IP to your web server in this chain.

How can I spoof the sender IP address using curl?

I need to make a request with a spoofed IP address for testing purposes. What's the easiest way to do this?
For my own purposes, changing the HTTP header was enough, via the following:
curl --header "X-Forwarded-For: 1.2.3.4" "http://www.foobar.com"
You can't.
In general, spoofing IP addresses for TCP is remarkably difficult. Unless you have control of a router quite near your target or near the IP you're spoofing, consider it impossible.
The reply packets need a path back to you in order to complete even the three-way handshake. The most reliable way to do this is to have control over a router in the most common pathway between your target and your spoofed IP address: this would let you capture packets between the target and the spoofed address and forward them on to you.
You could also try injecting bogus BGP route advertisements, but doing so would doubtless be noticed and cost you dearly when your peers drop you completely.
Can I make libcurl fake or hide my real IP address?
No. libcurl operates on a higher level. Besides, faking IP address
would imply sending IP packet with a made-up source address, and then
you normally get a problem with receiving the packet sent back as they
would then not be routed to you!
If you use a proxy to access remote sites, the sites will not see your
local IP address but instead the address of the proxy.
Also note that on many networks NATs or other IP-munging techniques
are used that makes you see and use a different IP address locally
than what the remote server will see you coming from.

Do all web requests contain the requestor's IP?

Am I able to depend on a requestor's IP coming through on all web requests?
I have an asp.net application and I'd like to use the IP to identify unauthenticated visitors. I don't really care if the IP is unique as long as there is something there so that I don't get an empty value.
If not I guess I would have to handle the case where the value is empty.
Or is there a better identifier than IP?
You can get this from Request.ServerVariables["REMOTE_ADDR"].
It doesn't hurt to be defensive. If you're worried about some horrible error condition where this isn't set, check for that case and deal with it accordingly.
There could be many reasons for this value not to be useful. You may only get the address of the last hop, like a load balancer or SSL decoder on the local network. It might be an ISP proxy, or some company NAT firewall.
On that note, some proxies may provide the IP for which they're forwarding traffic in an additional HTTP header, accessible via
Request.ServerVariables["HTTP_X_FORWARDED_FOR"]. You might want to check this first, then fall back to Request.ServerVariables["REMOTE_ADDR"] or Request.UserHostAddress.
It's certainly not a bad idea to log these things for reference/auditing.
I believe that this value is set by your web sever and there is really no way to fake it as your response to there request wouldn't be able to get back to them if they set there IP to something else.
The only thing that you should worry about is proxies. Everyone from a proxy will get the same IP.
You'll always get an IP address, unless your web server is listening on some sort of network that is not an IP network. But the IP address won't necessarily be unique per user.
Well, web request is an http connection, which is a tcp connection and all tcp connections have two endpoints. So, it always exists. But that's about as much as you know about it. It's neither unique nor reliably accurate (with all the proxies and stuff).
Yes, every request must have an IP address, but as stated above, some ISP's use proxies, NAT or gateways which may not give you the individual's computer.
You can easily get this IP (in c#) with:
string IP = Context.Request.ServerVariables["REMOTE_ADDR"].ToString();
or in asp/vbscript with
IP = request.servervariables("REMOTE_ADDR")
IP address is not much use for identifying users. As mentioned already corporate proxies and other private networks can appear as a single IP address.
How are you authenticating users? Typically you would have them log in and then store that state in their session in your app.

How does client-machine/browser handle unrequested HTTP response?

Imagine the following:
User goes to script (http://sample.org/test.php),
Script sends an HTTP request to some other page (http://google.com/). For this example, we'll say using curl.
The script sets the IP address of the request to the user's IP, via CURLOPT_INTERFACE.
I know already that the requesting script will not receive the response, as the remote-host will send any responses to the IP address given in the request.
What I am wondering is what happens to this response? Assuming the client is on a LAN that has one external address and that all traffic sent to that IP is handled by a router acting as a DHCP server, will the response even get back to the user's machine? If it did, would there be any way to ensure that it was handled by the user's browser? And if so, how would the browser handle this, typically? Would it open a new window with Google in it?
I definitely have a follow up to this question, but I am very curious what goes on at this level, before I experiment further.
The script sets the IP address of the request to the user's IP, via CURLOPT_INTERFACE.
Usually, this won't work. Your ISP knows which IP address you are supposed to have and will not forward traffic coming from "fake" IP addresses.
In particular, since you can only communicate one-way with a fake IP (since the answer won't reach you), you would not be able to establish a working TCP connection, since TCP requires a three-way handshake. Thus, you wouldn't be able to submit your web request.
What I am wondering is what happens to this response? Assuming the client is on a LAN that has one external address and that all traffic sent to that IP is handled by a router acting as a DHCP server, will the response even get back to the user's machine?
If the user's PC has an internal IP address and uses NAT, the router will not know which LAN machine to forward the packet to (since it did not see any outgoing request to which it could match that response). Therefore, the answer would be dropped.
Even if you could get the response to reach the client:
If it did, would there be any way to ensure that it was handled by the user's browser?
No. As stated above, a TCP request consists of a three-way handshake. This handshake has not been completed, so the operating system would just drop the packet.
CURLOPT_INTERFACE is for use on computers that have multiple IP addresses assigned to them, to specify which of those addresses should be used as the source IP for the connection. You can't use it to spoof some other computer's IP address. Most likely you'll either get an error, or the option will be ignored and the OS will choose a source interface automatically (the default behavior).
The response will be returned on the same TCP connection as the request.

Resources