Why is `to.` a valid domain name? - networking

In visiting http://to./ you are given a legitimate website.
Is to. a valid domain name then, despite not ending with a TLD and having a superfluous period? Why?
Being valid, what would its DNS hierarchy be?

The final dot is part of the fully qualified domain name. More information in this article. Specifically:
It's a little-known fact, but fully-qualified (unambiguous) DNS domain names have a dot at the end. People running DNS servers usually know this (if you miss the trailing dots out, your DNS configuration is unlikely to work) but the general public usually doesn't. A domain name that doesn't have a dot at the end is not fully-qualified and is potentially ambiguous.

to is the TLD of Tonga.
There is no spec that says that a domain name must have something other than a TLD; Tonga is one of the few TLDs that has an A record for the TLD itself.
However, most browsers will not recognize a domain name that doesn't contain a period, so they use the FQDN, with a trailing dot.

The DNS represents a hierarchy of domain names. As T. pointed out, if you see a dot at the end of a FQDN, it just represents the upper root point of the whole domain name tree.
In the context of web browsers, they tend to be graceful and hide this detail from end users.
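To make the hierarchy concrete, here is a minimal Python sketch that walks a fully qualified name up to the root. The helper names is_absolute and hierarchy are made up for this illustration; nothing here performs a real DNS query.

```python
def is_absolute(name: str) -> bool:
    """A name written with a trailing dot is absolute (fully qualified)."""
    return name.endswith(".")


def hierarchy(fqdn: str) -> list[str]:
    """List each level of the tree, from the name itself up to the root '.'."""
    if not is_absolute(fqdn):
        raise ValueError("expected a fully qualified name ending in '.'")
    # The root itself has no labels; otherwise drop the final dot and split.
    labels = fqdn[:-1].split(".") if fqdn != "." else []
    chain = [".".join(labels[i:]) + "." for i in range(len(labels))]
    return chain + ["."]


print(hierarchy("to."))               # ['to.', '.']
print(hierarchy("www.example.com."))  # ['www.example.com.', 'example.com.', 'com.', '.']
```

So for to. the hierarchy is just the TLD label sitting directly under the root.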

Related

Dealing with multiple sub-sub domains as single hostname in GTM

My company has recently changed the way we deal with affiliates in the URL structure.
The old format: subdomain.website.com/page/?a=affiliate
with hostname subdomain.website.com
New format: affiliate.subdomain.website.com/page with hostname affiliate.subdomain.website.com
Is there a way (ideally in Google Tag Manager) to change the hostname being passed such that the affiliate part (if present) is stripped out?
More info: Not all visitors to the website come from affiliates. The issue I am facing is that the hostname for affiliates being passed in the data layer is different from what it used to be and is unique for each affiliate that we work with. This causes lookup tables (e.g. for analytics ID) not to work, as they don't match the hostname anymore. It would be impossible to keep the lookup tables updated for each new affiliate added, so I'm looking for a way to strip the affiliate part of the hostname to keep all hostnames consistent on this journey.
Sure. In your Google Analytics settings (either in your tag or, if you use one, in your settings variable) you can set the hostname field in the "fields to set" section.
If you have a single domain, you could just set a constant value for the hostname. If you still have (non-affiliate) subdomains you need to track, you still need a variable for that. Since it is not feasible to have a lookup table for all affiliate values, you would go the other way round and whitelist the hostnames you want to keep, e.g. by creating a lookup table with the hostname as input that returns your cleaned-up hostname as the default value and the original input for the hostnames you want to keep.
Such a table returns the hostname as entered for the subdomains you list (three in the original example) and a default for everything else, thus creating a whitelist.
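The whitelist logic can be sketched in a few lines of Python. This only illustrates the lookup, it is not GTM configuration; in GTM the same thing would be a lookup table variable with a default value. The three hostnames and the default are hypothetical placeholders.

```python
# Hypothetical hostnames; replace with the ones you actually want to keep.
KEEP_AS_IS = {
    "www.website.com",
    "subdomain.website.com",
    "shop.website.com",
}

# Hypothetical default returned for every hostname that is not whitelisted.
DEFAULT_HOSTNAME = "subdomain.website.com"


def clean_hostname(hostname: str) -> str:
    """Return whitelisted hostnames unchanged and the default for anything else."""
    host = hostname.lower()
    return host if host in KEEP_AS_IS else DEFAULT_HOSTNAME


print(clean_hostname("shop.website.com"))                 # shop.website.com
print(clean_hostname("affiliate.subdomain.website.com"))  # subdomain.website.com
```

New affiliates then need no maintenance, because any unknown hostname falls through to the default.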

If there exists a dot after ".com", is it a valid URL?

I came across a few URLs which also render with or without a dot/period after .com, while some do not.
For example:
www.example.com.
Should the URL render normally if a dot/period is added after .com or should it go to a 404 page?
As said in the comments, this great resource answers many of your questions. The portion specific to your question is below:
Fully-Qualified Domain Names
When I double-click a Bonjour (DNS-SD) Name in a web browser like Safari, the resulting URL has a hostname with a dot at the end. Is this a bug?
No, the dot at the end is correct.
You can try it here. Try adding a dot at the end of www.dns-sd.org, as shown in the subtitle at the top of this page, and you should still get the same page.
It's a little-known fact, but fully-qualified (unambiguous) DNS domain names have a dot at the end. People running DNS servers usually know this (if you miss the trailing dots out, your DNS configuration is unlikely to work) but the general public usually doesn't. A domain name that doesn't have a dot at the end is not fully-qualified and is potentially ambiguous. This was documented in the DNS specification, RFC 1034, way back in 1987:
Since a complete domain name ends with the root label, this leads to a printed form which ends in a dot. We use this property to distinguish between:
- a character string which represents a complete domain name (often called "absolute"). For example, "poneria.ISI.EDU."
- a character string that represents the starting labels of a domain name which is incomplete, and should be completed by local software using knowledge of the local domain (often called "relative"). For example, "poneria" used in the ISI.EDU domain.
How this affects web browsing
The people defining the HTTP protocol understood this issue, and RFC 1738 specifies clearly that the <host> part of a URL is supposed to contain a fully qualified domain name:
3.1. Common Internet Scheme Syntax
//<user>:<password>@<host>:<port>/<url-path>
host
The fully qualified domain name of a network host
Unfortunately, the people implementing web browser clients appeared not to understand what this meant. When you access a web site, the value most web browsers put in the "Host:" field is what the user typed, not what the computer actually ended up using, after applying the DNS user's searchlist to construct a fully-qualified name from the partial name. For example, here are three different ways the user may refer to the host "www.example.com."
www.example.com. — Absolute domain name
www.example.com — Relative domain name, which, after applying the "." that's always implicitly in everyone's DNS searchlist, becomes www.example.com.
www with "example.com" in DNS searchlist — user types "www" and gets www.example.com.
When sending the Host: parameter to the web server, the web browser client puts in what the user typed (www.example.com., www.example.com, or www) instead of what the client ended up actually looking up in DNS (www.example.com. in all three cases). Unfortunately the Apache web server (at least in some versions) doesn't recognise that all those three names are just three different ways of referring to the same host.
If you're a web site administrator setting up a web site using Apache "VirtualHost" directives or similar, you need to have a ServerAlias line listing all the things the user might type to get to that web site (typically the first label, the whole name without a trailing dot, and the whole name with a trailing dot, as shown in the example above).
See: http://www.dns-sd.org/trailingdotsindomainnames.html
And the old RFC it links to: http://www.ietf.org/rfc/rfc1034.txt
Truly fully qualified domain names have a period after the TLD, but unless you're managing a DNS server you almost never come across them. It is however something you might want to consider if you were for instance writing an HTTP server varying on hostname.
A period at the end of a hostname is an indicator that the resolver should not attempt to use its search domains in order to resolve the hostname if the given name does not resolve. That is, if the resolver has a search domain of "lan", if you attempt to look up "web" it would first try resolving "web" followed by "web.lan", but with "web." it would only try "web".
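The search-domain behaviour just described can be sketched like this. It is a simplification that follows the order given above (bare name first, then each search domain); real resolvers have additional knobs such as the ndots option that can change the order. The function name lookup_candidates is made up for the illustration.

```python
def lookup_candidates(name: str, search_domains: list[str]) -> list[str]:
    """Return the absolute names a resolver might try, in order."""
    if name.endswith("."):
        # Absolute name: the search list is not consulted at all.
        return [name]
    candidates = [name + "."]
    for domain in search_domains:
        candidates.append(f"{name}.{domain.strip('.')}.")
    return candidates


print(lookup_candidates("web", ["lan"]))   # ['web.', 'web.lan.']
print(lookup_candidates("web.", ["lan"]))  # ['web.']
```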
As for the server, it never sees the URL, only the hostname and path (as separate entities), and there is no reason for it to complain if the Host header includes the period (although there is also no reason for the client to include it).
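If you are writing a server that varies on hostname, one hedged way to cope is to normalise the Host header before matching it. The sketch below assumes plain hostnames with an optional port (IPv6 literals are out of scope), and normalize_host is a hypothetical helper, not part of any framework.

```python
def normalize_host(host_header: str) -> str:
    """Lower-case the Host value and drop a single trailing dot from the hostname."""
    host = host_header.strip().lower()
    # Split off an optional port; IPv6 literals are not handled in this sketch.
    hostname, sep, port = host.partition(":")
    if hostname.endswith("."):
        hostname = hostname[:-1]
    return hostname + sep + port


print(normalize_host("www.example.com."))       # www.example.com
print(normalize_host("WWW.Example.COM.:8080"))  # www.example.com:8080
```

With that in place, www.example.com and www.example.com. select the same virtual host.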

Why can foo.example.com set cookies for example.com?

From both the documentation and this link, I already know that foo.example.com can set cookies for example.com by sending a response with Domain=example.com in the Set-Cookie header. But why is this allowed?
For example, a server (say, foo.example.com) cannot set cookies for its siblings (say, bar.example.com) or for the domain names lower than it (also known as "its children", say, ide.foo.example.com), but it can set cookies for the domain names higher than it (also known as "its parents", in this case example.com).
Let me make the statement of the question even more clear by putting it into the real world. Just like apps on Google App Engine, foo.appspot.com obviously cannot set cookies for bar.appspot.com because they are two different apps, and they shouldn't affect each other's behavior. But why is it allowed for foo.appspot.com to set cookies for appspot.com by sending Domain = appspot.com in its response header? By doing this the foo.appspot.com app can actually affect other apps' behavior on Google App Engine, since the browser will send this cookie when visiting bar.appspot.com, the domain name of which is a child of appspot.com.
I learned all these things about cookies from the Web Development course on Udacity. But I'm really confused with this question. Can anybody help explain this? Thanks in advance. :-)
The link you provided is horribly outdated. Too bad people googling "cookie domain" will find it first.
I should write a better one, but for now, to quickly answer your question: it is about "public suffix" domains.
Can server "example.com" set a cookie for "com"? Nope, because "com" is a public suffix.
Can "foo.co.uk" set a cookie for "co.uk"? Nope, because "co.uk" is a public suffix.
It happens that "appspot.com" is also a public suffix; so "foo.appspot.com" cannot set a cookie with domain="appspot.com". (It can, but browsers will reject it)
Unfortunately, there's no algorithm to determine which domain is a public suffix. The list of all public suffixes is maintained manually at https://publicsuffix.org/
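Roughly, the check a browser applies looks like the sketch below. It is a simplification: the suffix set contains only the three entries mentioned above rather than the real list, and actual browser rules have more details. The function name may_set_cookie_domain is hypothetical.

```python
# Tiny illustrative subset; the real list at publicsuffix.org is far longer.
PUBLIC_SUFFIXES = {"com", "co.uk", "appspot.com"}


def may_set_cookie_domain(request_host: str, cookie_domain: str) -> bool:
    """Approximate whether a host may set a cookie with the given Domain attribute."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().strip(".")
    if domain in PUBLIC_SUFFIXES:
        # Cookies scoped to a public suffix are rejected.
        return False
    # Otherwise the Domain must be the host itself or one of its parents.
    return host == domain or host.endswith("." + domain)


print(may_set_cookie_domain("foo.example.com", "example.com"))      # True
print(may_set_cookie_domain("foo.example.com", "bar.example.com"))  # False
print(may_set_cookie_domain("foo.appspot.com", "appspot.com"))      # False
```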

Is it ok to use http:// inside an URL body?

As far as I understand, a URL consists of the following fields:
Protocol (http, https, ftp, etc.)
User name
User Password
Host address (an IP address or a DNS FQDN)
Port (which can be implied)
Path to a document inside the server documents root
Set of arguments and values
Document part (#)
as
protocol://user:password@host:port/path/document?arg1=val1&arg2=val2#part
But I've just met an example of using "http://" inside the path part: there is a redirection service (showing ads and paying money for traffic you route through it) which just adds a target URL (in full form, with "http://") to its own. Is it considered OK from a standards point of view? Doesn't it break anything? Normally I'd never expect to meet a "//" double slash, a colon or a "#" inside a valid URL except in the places where they appear in the example above.
No, it is not okay from a standards perspective.
Per Section 3.3 (Path Component) of RFC 2396, the characters "/", ";", "=", and "?" are reserved within a path segment, so they cannot appear there as literal data without being escaped.
Usually, browsers encode such malformed URIs before making the HTTP request, which is why it works in practice.
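If you control how the redirect link is built, you can escape the target yourself instead of relying on the browser. A minimal Python sketch, assuming hypothetical hosts redirect.example and target.example and a made-up helper build_redirect_url:

```python
from urllib.parse import quote, unquote, urlsplit


def build_redirect_url(service_base: str, target: str) -> str:
    """Append the target URL to the service URL as one percent-encoded path segment."""
    # safe="" makes quote() escape ":" and "/" as well as "?" and "=".
    encoded = quote(target, safe="")
    return f"{service_base.rstrip('/')}/{encoded}"


url = build_redirect_url("http://redirect.example/go", "http://target.example/page?x=1")
print(url)  # http://redirect.example/go/http%3A%2F%2Ftarget.example%2Fpage%3Fx%3D1

# The service can recover the target by decoding the last path segment.
last_segment = urlsplit(url).path.rsplit("/", 1)[-1]
print(unquote(last_segment))  # http://target.example/page?x=1
```

Encoded this way, the embedded "://" and "?" no longer collide with the delimiters of the outer URL.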

cookie domain getting changed, only one cookie with '.' one without

I am explicitly setting a cookie domain so that it is shared between the domain and a subdomain, think mysite.com and payment.mysite.com. Sometimes I get two session cookies when I have only specified one. When looking in Firefox, the domains on the cookies are different: one is "mysite.com" and the other is ".mysite.com". How does this happen? I am setting the domain to mysite.com, but it is trimmed from one.
I am using asp.net.
Thanks
It depends on what you specify as the domain in the setcookie function. Please take a look at the description here: http://php.net/setcookie.
