How does a webserver know what website you want to access? - http

Apache has something called VirtualHosts.
You can configure it so that example.com serves a different site than example2.com, even if both names resolve to the same IP address.
An HTTP request looks something like this:
GET /index.html HTTP/1.0
[some more]
How does the server know you are trying to access www.example.com or www.example2.com?

In addition to the GET line, the browser sends a number of headers. One of these headers is the Host header, which specifies which host the request is targeted at.
A simple example request could be:
GET /index.html HTTP/1.0
Host: example.com
This indicates that the browser wants whatever is at http://example.com/index.html, and not what is at http://example2.com/index.html.
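The lookup that a name-based virtual host performs can be sketched in a few lines of Python. This is only an illustration of the idea, not how Apache implements it: the `SITES` table, the document-root paths, and the function names are all made up for the example.

```python
# Hypothetical mapping from host name to document root (illustrative only).
SITES = {
    "example.com": "/var/www/example",
    "example2.com": "/var/www/example2",
}
DEFAULT_ROOT = "/var/www/default"


def parse_request(raw: str):
    """Split a raw HTTP request into (method, path, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return method, path, headers


def pick_document_root(raw: str) -> str:
    """Choose a site by the Host header, ignoring any :port suffix."""
    _method, _path, headers = parse_request(raw)
    host = headers.get("host", "").split(":", 1)[0].lower()
    return SITES.get(host, DEFAULT_ROOT)


request = "GET /index.html HTTP/1.0\r\nHost: example.com\r\n\r\n"
print(pick_document_root(request))
```

Two requests for the same path on the same IP address end up in different document roots purely because their Host headers differ. A request without a Host header falls back to the default site.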
Further information:
The Host header in the HTTP specification

IIS also has this, and I believe it refers to it as host headers.
The HTTP request header contains the destination hostname, which the server uses to determine which website to serve. Some more reading: http://www.it-notebook.org/iis/article/understanding_host_headers.htm

Related

How does proxy server know the target domain of the client?

I'm currently writing a proxy server in nodejs. To proceed, I need to know how to reliably determine the originally intended domain of the client. When a client is configured to use a proxy, is there a universal way that the client sends this information (e.g. one of the two examples below), or is it application specific (e.g. Chrome proxy settings may do it differently to IE proxy settings, which may be different to a configuration for a proxy for an entire Windows machine, etc.)?
An HTTP request to the proxy server could look something like this, which would suffice:
GET /something HTTP/1.1
Host: example.com
...
In this case, the proxy could get the hostname from the 'Host' header, get the path in the first line of the HTTP request, and then have sufficient information.
It could also look something like this, which would suffice:
GET http://example.com/something HTTP/1.1
...
with an absolute URL (including the FQDN) in the request line, in which case the proxy could retrieve both the host and the path from the first line of the HTTP request.
Any information regarding this would be greatly appreciated! Thanks in advance for the help!
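Both request shapes from the question can be handled by one small function. HTTP/1.1 specifies that clients send the absolute URL in the request line when talking to a proxy, but falling back to the Host header costs little. The sketch below is in Python rather than Node.js and covers plain HTTP only (no CONNECT tunnelling for HTTPS); the function name `proxy_target` and the assumption that the caller passes lower-cased header names are mine.

```python
from urllib.parse import urlsplit


def proxy_target(request_line: str, headers: dict):
    """Return (host, port, path) for a plain-HTTP proxy request.

    Handles the two shapes from the question: an absolute URL in the
    request line, or a path plus a Host header.
    """
    _method, target, _version = request_line.split(" ", 2)
    if target.startswith("http://"):
        url = urlsplit(target)
        path = url.path or "/"
        if url.query:
            path += "?" + url.query
        return url.hostname, url.port or 80, path
    # Fall back to the Host header (names assumed lower-cased by the caller).
    host, _, port = headers.get("host", "").partition(":")
    return host, int(port) if port else 80, target


print(proxy_target("GET http://example.com/something HTTP/1.1", {}))
print(proxy_target("GET /something HTTP/1.1", {"host": "example.com"}))
```

Both calls yield the same host, port, and path, so the rest of the proxy does not need to care which form the client used.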

Can I redirect a request with code 302 to another port without using the HOST variable?

In the HTTP protocol, there are multiple status codes that can be used to redirect a request to another URL, such as 301 Moved Permanently or 302 Found. To my knowledge, the target URL can either contain a host (http://example.com/example.html) or let the host implicitly be the current host (/example.html).
When using the first form, one can redirect to a non-standard port (http://example.com:8080/example.html). How can this be done when not specifying the host?
Currently, I parse the Host request header and build the new URL. But AFAIK, that header is not strictly required to be sent, so I want to avoid relying on it.
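You can't specify just the port in a redirect: a relative target such as /example.html inherits the scheme, host, and port of the current request. And the "Host" header field is in fact strictly required in HTTP/1.1, so building the URL from it is safe for HTTP/1.1 clients.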
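Since the port cannot be given on its own, the redirect target has to be assembled from the Host header. A minimal sketch of that assembly, with hypothetical helper names (`strip_port`, `redirect_location`) and a default scheme of http assumed:

```python
def strip_port(host_header: str) -> str:
    """Drop a trailing :port from a Host header value, if there is one."""
    host = host_header.strip()
    if host.endswith("]"):  # bracketed IPv6 literal without a port
        return host
    name, sep, port = host.rpartition(":")
    return name if sep and port.isdigit() else host


def redirect_location(host_header: str, new_port: int, path: str,
                      scheme: str = "http") -> str:
    """Build an absolute Location value: same host, different port."""
    return f"{scheme}://{strip_port(host_header)}:{new_port}{path}"


print(redirect_location("example.com", 8080, "/example.html"))
```

The returned string would be sent as the value of the Location header alongside the 301 or 302 status.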

Nginx - Access Http custom headers

I have a simple requirement. I have an nginx web server and a NetScaler proxy. On the NetScaler, the Client_IP header option is checked, and the name of the header is HTTP_CLIENT_IP.
I want to access this IP in the nginx log. I have specified a custom log format so that I can access this value.
I have tried the following variables in the log format, and they just return '-':
$http_client_ip
$http_request_body
Basically, I want to read the entire request header / body that nginx receives from NetScaler.
Any help would be appreciated!
NetScaler inserts an HTTP header with the client IP, if enabled. However, you have to configure the HTTP header name on the NetScaler.
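The header name chosen on the NetScaler determines which nginx variable carries the value: nginx exposes a request header as `$http_` followed by the header name in lower case with dashes replaced by underscores. The small Python helper below only illustrates that naming rule; `nginx_header_variable` is a made-up name, and `Client-IP` is an example header name, not necessarily what your NetScaler sends.

```python
def nginx_header_variable(header_name: str) -> str:
    """Return the nginx variable that would hold the given request header."""
    return "$http_" + header_name.lower().replace("-", "_")


for name in ("Client-IP", "X-Forwarded-For", "HTTP_CLIENT_IP"):
    print(name, "->", nginx_header_variable(name))
```

By this rule, `$http_client_ip` matches a header called `Client-IP`, whereas a header literally named `HTTP_CLIENT_IP` would map to `$http_http_client_ip`. Note also that nginx ignores header names containing underscores unless `underscores_in_headers` is switched on, so a dash-separated header name on the NetScaler side is the safer choice.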

Do resources in a URL path have their own IP address?

So, a DNS server recognizes https://www.google.com as 173.194.34.5
What does, say, https://www.google.com/images/srpr/logo11w.png look like to a server? Or are URL strings machine readable?
Good question!
When you access a URL, a DNS lookup is first done on the host part (www.google.com). After that, the browser looks at the protocol and connects using it (https in this case).
After connecting, the browser will tell the server:
"Hi! I'm trying to connect to www.google.com and I would like the resource /images/srpr/logo11w.png." On the wire, this looks like:
GET /images/srpr/logo11w.png HTTP/1.1
Host: www.google.com
The Host part is an HTTP header. There are usually more headers.
So the short answer is:
The server will get access to both the hostname and the full path the browser tried to access.
https://www.google.com/images/srpr/logo11w.png
consists of several parts:
protocol (https)
address of the server (www.google.com, which gets translated to an IP address)
path to the resource (/images/srpr/logo11w.png; in this example it looks like an image in a directory srpr, which is in a directory images in the root of the website)
The server processes the path to the resource the user requested (via the GET method) based on various rules and returns a response.
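The same breakdown can be reproduced with Python's standard `urllib.parse` module. This is just a sketch to show which part of the URL goes where; the variable names are arbitrary.

```python
from urllib.parse import urlsplit

url = urlsplit("https://www.google.com/images/srpr/logo11w.png")

print(url.scheme)    # protocol: decides how to connect
print(url.hostname)  # server address: resolved to an IP via DNS
print(url.path)      # resource path: sent to the server as-is

# Only the path and the host name end up in the request itself.
request = f"GET {url.path} HTTP/1.1\r\nHost: {url.hostname}\r\n\r\n"
print(request)
```

So the path never gets an IP address of its own: only the host name is resolved, and the path is plain text that the server interprets.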

requesting a domain but going to another host

So I am in a place where I have access to only 5 websites, and I am trying to bypass this restriction.
When I try to browse any of those websites, I don't have any problem, for example stackoverflow.com, but I can't access 1.1.1.1 (which is the IP of stackoverflow).
It means that whatever is blocking the other websites allows only those 5 domains.
Is there any way I can submit a web request to 2.2.2.2 but request stackoverflow.com in the headers to bypass this restriction?
I have no idea how DNS or a simple HTTP request works. I appreciate any idea to start with, or at least something to read.
Also, I can't change my DNS servers.
You can try it with any telnet application:
telnet google.com 80
GET / HTTP/1.1
Host: stackoverflow.com
End your request with a double Enter (an empty line).
If you receive HTML, then the proxy is letting the request pass.
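If no telnet client is available, the same experiment can be done with a raw socket. The following is a rough Python equivalent of the telnet session above; the function names `build_request` and `fetch` are my own, and the `Connection: close` header is added only so that the server ends the reply and the read loop terminates.

```python
import socket


def build_request(host_header: str, path: str = "/") -> bytes:
    """Compose a minimal GET request; the final blank line is the 'double Enter'."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(connect_to: str, host_header: str, port: int = 80,
          timeout: float = 5.0) -> bytes:
    """Connect to one address while naming another host in the Host header."""
    with socket.create_connection((connect_to, port), timeout=timeout) as sock:
        sock.sendall(build_request(host_header))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("google.com", "stackoverflow.com")` mirrors the telnet example: the TCP connection goes to the first name, while the Host header asks for the second. What comes back depends entirely on the server and on whatever is filtering the traffic in between.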

Resources