DNS lookup and HTTP proxies - http

How are DNS lookups handled when using a proxy? I tried nslookup google.com and my local DNS was unable to resolve it, so what path do DNS lookups take with a simple HTTP proxy?

It depends on what type of HTTP proxy and protocol you're using.
In the cases you're most likely to encounter, the web browser passes the fully qualified domain name to the HTTP proxy, which performs the DNS lookup and forwards the request (and potentially takes other actions).
However, there are cases where your web browser has to do its own DNS lookup. These include certain proxy types (e.g. SOCKS 4, which only accepts IP addresses) and hostnames that are on your browser's "bypass the proxy" list.
Your system administrators may have configured your machine to do all DNS lookups via a proxy. There are a number of reasons for this: to minimize organizational DNS requests, to monitor all DNS lookups in the organisation more easily, or to route certain DNS queries as they see fit. If this is the case, they have probably removed direct DNS lookup abilities from your machine, which would explain why nslookup fails.
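To make the first case concrete, here is a minimal sketch of the bytes a client would send to a forward proxy. The function name `proxy_request` is made up for illustration; the point is only that the hostname travels inside the request, so the client never has to resolve it.

```python
from urllib.parse import urlsplit, urlunsplit


def proxy_request(url: str) -> bytes:
    """Build the first request a client would send to an HTTP forward proxy."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme == "https":
        # HTTPS: ask the proxy to open a tunnel to host:port.
        # The proxy, not the client, resolves the hostname.
        port = parts.port or 443
        return (f"CONNECT {host}:{port} HTTP/1.1\r\n"
                f"Host: {host}:{port}\r\n\r\n").encode("ascii")
    # Plain HTTP: the request line carries the absolute URL,
    # so again the proxy performs the DNS lookup.
    target = urlunsplit((parts.scheme, parts.netloc, parts.path or "/", parts.query, ""))
    return (f"GET {target} HTTP/1.1\r\n"
            f"Host: {parts.netloc}\r\n\r\n").encode("ascii")


print(proxy_request("http://google.com/").decode())
print(proxy_request("https://google.com/").decode())
```

In both branches the name google.com is sent as text to the proxy, which is why a machine without working local DNS can still browse through it.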

Related

Does internal communication between private servers use DNS and HTTPS?

I would like to know how internal communication links between private internal servers and a reverse proxy look.
When I make a request from my client (browser) to, say, https://facebook.com, I hit Facebook's reverse proxy. I have two questions. When that reverse proxy gets a request and needs to forward it to the server that should handle it, does the server it is forwarding the request to have a domain name, or is it just an IP address (user.facebook.com or useroffacebook.com vs. 34.23.66.25 (DO NOT GO TO THAT ADDRESS, I JUST MADE IT UP!))? Also, does that connection use HTTP or HTTPS?
Like Kshitij Joshi already mentioned, it could be both.
A more detailed perspective for implementation:
A reverse proxy should use IP addresses for routing, so that it keeps working even if DNS fails or is unavailable to the proxy for some reason.
Internal traffic should also be encrypted (HTTPS). Using plain text, even in internal networks, must be considered dangerous and is not recommended.
In my view, you can replace each 'should' with a 'must'.

How are URLs mapped to their respective IP addresses in DNS?

What would be an explanation to a site being mapped to its IP address in DNS? I know inverse tree / resolver and name server are part of the process, but what are the actual steps?
They are not. The DNS does not deal with URLs. A URL is a concept at layer 4 of the Internet (TCP/IP) stack, that is, the application protocol layer, such as HTTP here.
In the DNS you find domain names, host names, and IP addresses (both v4 and v6).
The browser extracts the hostname from the URL, resolves it to an IP address, and connects to it. Under HTTPS it sends the hostname in the SNI extension during the TLS handshake. It then sends the rest of the URL inside its first HTTP message, with the hostname carried in the Host header.
There is a URI record type in the DNS (RFC 7553), but it is rarely used. In theory SRV records could also be used by browsers to find the proper server to connect to based on the hostname in the URL, but in practice browsers do not use them, for various technical and non-technical reasons.
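As a rough illustration of which part of a URL the DNS ever sees, the sketch below splits a URL with Python's standard `urllib.parse`. The function name `split_for_lookup` and the example.com URL are placeholders chosen for this example, not anything from the question.

```python
from urllib.parse import urlsplit

# Default ports a browser assumes when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_for_lookup(url: str) -> dict:
    """Separate the part of a URL that goes to DNS from the parts that go to HTTP."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return {
        "dns_name": parts.hostname,          # the only thing the resolver is asked about
        "port": parts.port or DEFAULT_PORTS[parts.scheme],  # used for the TCP connection
        "host_header": parts.netloc,         # sent in the HTTP Host header
        "request_target": target,            # sent in the HTTP request line
    }


print(split_for_lookup("https://example.com/a/b?x=1"))
```

Only `dns_name` is handed to the resolver; the scheme, port, path, and query never appear in a DNS query.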

Map DNS entry to specific port

Let's say I have this DNS entry: mysite.sample. I am developing, and have a copy of my website running locally at http://localhost:8080. I want this website to be reachable using the (fake) DNS name http://mysite.sample, without being forced to remember which port the site is running on. I can set up /etc/hosts and nginx to do the proxying for that, but is there an easier way?
Can I somehow set up a simple DNS entry using /etc/hosts and/or dnsmasq where a non-standard port (something other than :80/:443) is also specified, without needing to provide extra configuration for nginx?
Or, phrased more simply: is it possible to provide port mappings for DNS entries in /etc/hosts or dnsmasq?
DNS has nothing to do with the TCP port. DNS is there to resolve names (e.g. mysite.sample) into IP addresses, kind of like a phone book.
So it's a clear "NO". However, there is another solution, which I will try to explain.
When you enter http://mysite.sample:8080 in your browser's URL bar, your client (e.g. the browser) will first try to resolve mysite.sample (via OS calls) to an IP address. This is where DNS kicks in, as DNS is your name resolver. Once that has happened, the job of DNS is finished and the browser continues.
This is where the "magic" in HTTP happens. The browser connects to the resolved IP address and the desired port (by default 80 for http and 443 for https), waits for the connection to be accepted, and then sends the following headers:
GET <resource> HTTP/1.1
Host: mysite.sample:8080
Now the server reads those headers and acts accordingly. Most modern web servers have something called "virtual hosts" (e.g. Apache) or "server blocks", often called "sites" (e.g. nginx). You can configure multiple vhosts/sites, one for each domain. The web server will then serve the site matching the requested host (which the browser takes from the URL bar and passes to the server via the Host HTTP header). This is pure HTTP and has nothing to do with TCP.
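The host-based selection described above can be sketched in a few lines. This is not how Apache or nginx implement it internally; the names `SITES`, `host_without_port`, and `pick_site`, and the directory paths, are hypothetical and only illustrate the idea of choosing a site from the Host header.

```python
# Hypothetical mapping from host name to the site that should answer for it.
SITES = {
    "mysite.sample": "/srv/mysite",
    "other.sample": "/srv/other",
}


def host_without_port(host_header: str) -> str:
    """Strip an optional ':port' suffix from a Host header value."""
    host = host_header.strip().lower()
    if host.startswith("["):
        # IPv6 literal such as [::1]:8080; keep everything up to the bracket.
        return host[: host.index("]") + 1]
    return host.split(":", 1)[0]


def pick_site(host_header: str, default: str = "/srv/default") -> str:
    """Return the site configured for this Host header, or the default site."""
    return SITES.get(host_without_port(host_header), default)


print(pick_site("mysite.sample:8080"))
print(pick_site("unknown.sample"))
```

Note that the port in the Host header plays no role in the lookup here: the server already knows which port the connection arrived on.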
If you can't change the port of your origin service (in your case 8080), you might want to set up a new web server in front of your service. This is called a reverse proxy. I recommend reading the NGINX Reverse Proxy docs, but you can also use Apache or any other modern web server.
For nginx, just set up a new site that matches the hostname and proxies requests to your service:
server {
    listen 80;
    server_name mysite.sample;
    location / { proxy_pass http://127.0.0.1:8080; }
}
There is a mechanism in DNS for discovering the ports that a service uses. It is called the Service Record (SRV) and has the form
_service._proto.name. TTL class SRV priority weight port target.
However, to make use of this record you would need to have an application that referenced that record prior to making the call. As Dominique has said, this is not the way HTTP works.
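To show what an application would have to do with such a record, here is a minimal sketch that parses one SRV line in the textual form given above. The record contents (an `_http._tcp` entry pointing mysite.sample at port 8080) and the names `SrvRecord` and `parse_srv` are invented for illustration; a real client would also need to perform the DNS query itself.

```python
from dataclasses import dataclass


@dataclass
class SrvRecord:
    service: str
    proto: str
    name: str
    ttl: int
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(line: str) -> SrvRecord:
    """Parse '_service._proto.name. TTL class SRV priority weight port target.'"""
    owner, ttl, rr_class, rr_type, priority, weight, port, target = line.split()
    if rr_class.upper() != "IN" or rr_type.upper() != "SRV":
        raise ValueError("not an IN SRV record")
    # The owner name packs three things: _service, _proto, and the domain.
    service, proto, name = owner.split(".", 2)
    return SrvRecord(
        service=service.lstrip("_"),
        proto=proto.lstrip("_"),
        name=name.rstrip("."),
        ttl=int(ttl),
        priority=int(priority),
        weight=int(weight),
        port=int(port),
        target=target.rstrip("."),
    )


record = parse_srv("_http._tcp.mysite.sample. 3600 IN SRV 10 5 8080 localhost.")
print(record.target, record.port)
```

The port is right there in the record, but nothing happens unless the client software asks for it, and browsers do not.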
I have written a previous answer that explains some of the background to this, and why SRV lookups for HTTP aren't in the standard. (The article discusses WS, but the underlying discussion suggested adding this to the HTTP protocol directly.)
Edited to add -
There was actually a draft IETF document exploring an official way to do this, but it never made it past draft stage.
This document specifies a new URI scheme called http+srv which uses a DNS SRV lookup to locate a HTTP server.
There is a specific SO answer here which points to an interesting post here.

How does Proxifier resolve hostnames through proxy?

What technique does Proxifier use to resolve hostnames through a proxy? All other solutions I've found on the Internet offer DNS through SOCKS, just like Badvpn/Tun2Socks does. But Proxifier can work even through an HTTP proxy, and the only thing you need is for your proxy server to support DNS (Squid, for example). Their explanation is very brief, saying "Proxifier has to assign placeholder (fake) IP addresses". But what does that mean exactly?
Note: As you know, DNS queries are UDP by default and can't naturally be forwarded through an HTTP proxy. Browsers are another example of software that does name resolution through the proxy when set to use one.
Assigning a fake IP in response to a DNS request is different from returning a truncated DNS answer.
Here's the related RFC about assigning fake IPs in response to a DNS request in order to capture the domain name and pass it to the remote proxy server afterward: https://www.rfc-editor.org/rfc/rfc3089
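A minimal sketch of the placeholder-address idea follows. It is not Proxifier's actual implementation, which is not documented here; the class name `PlaceholderPool`, the helper `connect_line`, and the choice of 198.18.0.0/15 (a range reserved for benchmarking) as the fake pool are all assumptions made for illustration.

```python
import ipaddress


class PlaceholderPool:
    """Hand out fake IPs for hostnames and remember which name each one stands for."""

    def __init__(self, cidr: str = "198.18.0.0/15"):  # assumed pool, not from the source
        self._free = iter(ipaddress.ip_network(cidr).hosts())
        self._by_name = {}
        self._by_ip = {}

    def fake_ip_for(self, hostname: str) -> str:
        """Answer a local 'DNS query' with a placeholder address."""
        name = hostname.lower().rstrip(".")
        if name not in self._by_name:
            ip = str(next(self._free))
            self._by_name[name] = ip
            self._by_ip[ip] = name
        return self._by_name[name]

    def hostname_for(self, ip: str):
        """Map a placeholder address back to the hostname, or None if it is a real IP."""
        return self._by_ip.get(ip)


def connect_line(pool: PlaceholderPool, dest_ip: str, port: int) -> str:
    """When the application connects to dest_ip, tell the proxy the real name instead."""
    name = pool.hostname_for(dest_ip)
    target = name if name is not None else dest_ip
    return f"CONNECT {target}:{port} HTTP/1.1"


pool = PlaceholderPool()
fake = pool.fake_ip_for("google.com")       # what the application is told
print(fake, "->", connect_line(pool, fake, 443))
```

The application believes it resolved the name and connects to the fake address; the interception layer swaps the hostname back in, so the real DNS lookup happens on the proxy.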
I found the answer. Redsocks has indeed implemented this, as it says:
Redsocks includes `dnstc' that is fake and really dumb DNS server that
returns "truncated answer" to every query via UDP. RFC-compliant
resolver should repeat same query via TCP in this case - so the
request can be redirected using usual redsocks facilities.
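The "truncated answer" trick in that quote can be sketched as follows. This is not redsocks' code; `truncated_reply` is a hypothetical helper that assumes a standard query with exactly one question, and it only builds the reply bytes without opening any socket.

```python
import struct

QR_RESPONSE = 0x8000   # message is a response
TC_TRUNCATED = 0x0200  # answer was truncated, retry over TCP
KEEP_FROM_QUERY = 0x7900  # opcode bits and the RD (recursion desired) bit


def truncated_reply(query: bytes) -> bytes:
    """Build an empty DNS response with the TC bit set for the given UDP query."""
    if len(query) < 12:
        raise ValueError("DNS message shorter than its 12-byte header")
    ident, flags, qdcount = struct.unpack("!HHH", query[:6])
    if qdcount != 1:
        raise ValueError("this sketch only handles single-question queries")
    # Walk the question name (length-prefixed labels ending in a zero byte),
    # then skip the 2-byte QTYPE and 2-byte QCLASS.
    pos = 12
    while query[pos] != 0:
        pos += query[pos] + 1
    pos += 1 + 4
    reply_flags = QR_RESPONSE | TC_TRUNCATED | (flags & KEEP_FROM_QUERY)
    header = struct.pack("!HHHHHH", ident, reply_flags, 1, 0, 0, 0)
    return header + query[12:pos]


# A hand-built query for google.com, type A, class IN, with RD set.
query = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0) \
    + b"\x06google\x03com\x00" + struct.pack("!HH", 1, 1)
print(truncated_reply(query).hex())
```

A resolver that receives this reply is expected to repeat the query over TCP, and a TCP connection is something a redirector can capture and forward.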

Get domain the server was reached over?

In general, on any non-HTTP server, is there a way to detect which domain was used to reach the IP?
I know HTTP servers get the domain passed in the request header, but would this be possible with any other server whose protocol does not require the client to send this information?
I'm especially looking for a way to do this with the Minecraft server (Bukkit), so my preferred language (if you need one to answer) would be Java. But I'd prefer the theory not to be language specific.
In general, no, which is why the HTTP protocol includes it in the headers.
In order to reach your server, first a DNS lookup is performed to resolve your IP, which is then followed by the connection itself. These two steps are separate, and hard to link together.
Logging what domain was last requested by a client is tricky, too, as DNS information is often cached, so the DNS request may not even reach your DNS server before being answered.
If it isn't cached, it also often isn't directly looked up by the end client, but rather by a caching DNS server operated, for instance, by the ISP.
No. The only way to get the DNS name used to connect to a server is to have the client provide it.
No. If the protocol itself has no means for this, like the Host header in HTTP, you cannot find out which hostname the client resolved to reach your IP address.
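The point made in these answers can be observed with a small loopback experiment. The sketch below is only a demonstration under the assumption that "localhost" resolves to 127.0.0.1 on the machine running it; `observe_connection` is a made-up name. The client connects by name, and the server records everything it can see about the connection.

```python
import socket
import threading


def observe_connection(hostname: str = "localhost") -> dict:
    """Connect to a throwaway local server by name and report what the server saw."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    seen = {}

    def accept_once():
        conn, peer = server.accept()
        with conn:
            seen["peer_ip"] = peer[0]                  # client address: an IP
            seen["local_ip"] = conn.getsockname()[0]   # address we were reached on: an IP
            seen["first_bytes"] = conn.recv(1024)      # whatever the protocol chose to send

    worker = threading.Thread(target=accept_once)
    worker.start()
    # The client resolves the name itself; only the resulting IP is used to connect.
    with socket.create_connection((hostname, port), timeout=5) as client:
        client.sendall(b"hello")
    worker.join(timeout=5)
    server.close()
    return seen


print(observe_connection())
```

Nothing in the result contains the string "localhost": the server sees two IP addresses and the payload, so the name is only available if the protocol puts it in the payload.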
