Query string after the domain name - http

I am trying to add a query string at the end of URL for a hyperlink control as follows
HyperLink testLink = new HyperLink();
testLink.NavigateUrl = "http://www.example.com" + "?siteId=asd343s32kj343dce";
But when this is rendered in the browser it is displaying as
http://www.example.com/?siteId=asd343s32kj343dce (/ char after the .com).
And if the testLink.NavigateUrl = "http://www.example.com/abc.aspx" + "?siteId=asd343s32kj343dce";
Then the link is rendered correctly as http://www.abcd.com/abc.aspx?siteId=asd343s32kj343dce (No extra characters).
Am I missing any thing? Please advice.
Thank you,
Krishna.

The browser is correcting the URL for you by assuming that there should be a slash after the domain name. You might run into problems with browsers that doesn't do this, so you should correct the URL to:
testLink.NavigateUrl = "http://www.abcd.com/" + "?siteId=asd343s32kj343dce";
The reason that the slash should be after the domain name is that the domain name itself can not be a resource. The domain name just specifies the web site, the URL has to have something that specifies a resource on that site, and the slash specifies the default page in the root folder of the site.

this is normal, the / tell that the domain name ended and you are now inside the structure of the website (root context in this case).
the second one is normal because abc.aspx is a webpage and it can accept querystring. a domain cannot accept a querystring.

An HTTP URL takes the form:
http://<host>:<port>/<path>?<searchpart>
where <host> and <port> are as described in Section 3.1. If :<port>
is omitted, the port defaults to 80. No user name or password is
allowed. <path> is an HTTP selector, and <searchpart> is a query
string. The <path> is optional, as is the <searchpart> and its
preceding "?". If neither <path> nor <searchpart> is present, the "/"
may also be omitted.
https://www.rfc-editor.org/rfc/rfc1738#section-3.3

Although http://example.com?query is a valid URI. The normalization of HTTP URIs states that http://example.com?query and http://example.com/?query are equal:
[…] because the "http" scheme makes use of an authority component, has a default port of "80", and defines an empty path to be equivalent to "/", the following four URIs are equivalent:
http://example.com
http://example.com/
http://example.com:/
http://example.com:80/
In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/".

Related

A good name for a URL path and query string and fragment?

URLs can be broken up into these components:
Sample URL: http://www.server.com:8080/path?query=string#fragment
protocol = http
host = www.server.com
port = 8080
path = /path
query = ?query=string
fragment = #fragment
Is there an established name for everything that comes after port (path, query, and fragment)? I was tempted to just call this "path" but that is not a good name as it does not include the query and fragment.
The IETF URI Spec defines the combination of these elements (path, query, and fragment) as a Relative Reference.
Perhaps "relative URL", as in "relative to Site Root"
The URL structure you are giving is containing the basic elements, and the fragment is usually coming before the query, the query in general is used when you are listing a part of data, so you need to access to it by this query, and parameters if needed.
Here a good review to list the steps for creating a good url.
there is no established name for everything together due to them all being different. Also you forgot the parameters part of the url. The parameters go in the query.

Is a URL with only scheme + path valid?

I know absolute path-only URLs (/path/to/resource) are valid, and refer to the same scheme, host, port, etc. as the current resource. Is the URL still valid if the same (or a different!) scheme is added? (http:/path/to/resource or https:/path/to/resource)
If it is valid according to the letter of the spec, how well do browsers handle it? How well do developers that may come across the code in the future handle it?
Addendum:
Here's a simple test case I set up on an Apache server:
resource/number/one/index.html:
link
resource/number/two/index.html:
two
Testing in Chrome 43 on OS X: The URL displayed when hovering over the link looks correct. Clicking the link works as expected. Looking at the DOM in the web inspector, hovering over the a href URL displays an incorrect location (/resource/number/one/http:/resource/number/two/).
Firefox 38 appears to also handle the click correctly. Weird.
No, it’s not valid. From RFC 3986:
4.2. Relative Reference
A relative reference takes advantage of the hierarchical syntax
(Section 1.2.3) to express a URI reference relative to the name space
of another hierarchical URI.
relative-ref = relative-part [ "?" query ] [ "#" fragment ]
relative-part = "//" authority path-abempty
/ path-absolute
/ path-noscheme
/ path-empty
The URI referred to by a relative reference, also known as the target
URI, is obtained by applying the reference resolution algorithm of
Section 5.
A relative reference that begins with two slash characters is termed
a network-path reference; such references are rarely used. A
relative reference that begins with a single slash character is
termed an absolute-path reference. A relative reference that does
not begin with a slash character is termed a relative-path reference.
A path segment that contains a colon character (e.g., "this:that")
cannot be used as the first segment of a relative-path reference, as
it would be mistaken for a scheme name. Such a segment must be
preceded by a dot-segment (e.g., "./this:that") to make a relative-
path reference.
where path-noscheme is specifically a path that doesn’t start with / whose first segment does not contain a colon, which addresses your question pretty specifically.

Is a URL with // in the path-section valid?

I have a question regarding URLs:
I've read the RFC 3986 and still have a question about one URL:
If a URI contains an authority component, then the path component
must either be empty or begin with a slash ("/") character. If a URI
does not contain an authority component, then the path cannot begin
with two slash characters ("//"). In addition, a URI reference
(Section 4.1) may be a relative-path reference, in which case the
first path segment cannot contain a colon (":") character. The ABNF
requires five separate rules to disambiguate these cases, only one of
which will match the path substring within a given URI reference. We
use the generic term "path component" to describe the URI substring
matched by the parser to one of these rules.
I know, that //server.com:80/path/info is valid (it is a schema relative URL)
I also know that http://server.com:80/path//info is valid.
But I am not sure whether the following one is valid:
http://server.com:80//path/info
The problem behind my question is, that a cookie is not sent to http://server.com:80//path/info, when created by the URI http://server.com:80/path/info with restriction to /path
See url with multiple forward slashes, does it break anything?, Are there any downsides to using double-slashes in URLs?, What does the double slash mean in URLs? and RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax.
Consensus: browsers will do the request as-is, they will not alter the request. The / character is the path separator, but as path segments are defined as:
path-abempty = *( "/" segment )
segment = *pchar
Means the slash after http://example.com/ can directly be followed by another slash, ad infinitum. Servers might ignore it, but browsers don't, as you have figured out.
The phrase:
If a URI does not contain an authority component, then the path cannot begin
with two slash characters ("//").
Allows for protocol-relative URLs, but specifically states in that case no authority (server.com:80 in your example) may be present.
So: yes, it is valid, no, don't use it.

URL without "http|https"

I just learned from a colleague that omitting the "http | https" part of a URL in a link will make that URL use whatever scheme the page it's on uses.
So for example, if my page is accessed at http://www.example.com and I have a link (notice the '//' at the front):
Google
That link will go to http://www.google.com.
But if I access the page at https://www.example.com with the same link, it will go to https://www.google.com
I wanted to look online for more information about this, but I'm having trouble thinking of a good search phrase. If I search for "URLs without HTTP" the pages returned are about urls with this form: "www.example.com", which is not what I'm looking for.
Would you call that a schemeless URL? A protocol-less URL?
Does this work in all browsers? I tested it in FF and IE 8 and it worked in both. Is this part of a standard, or should I test more browsers?
Protocol relative URL
You may receive unusual security warnings in some browsers.
See also, Wikipedia Protocol-relative URLs for a brief definition.
At one time, it was recommended; but going forward, it should be avoided.
See also the Stack Overflow question Why use protocol-relative URLs at all?.
It is called network-path reference (the part that is missing is called scheme or protocol) defined in RFC3986 Section 4.2
4.2 Relative Reference
A relative reference takes advantage of the hierarchical syntax
(Section 1.2.3) to express a URI reference relative to the name space
of another hierarchical URI.
relative-ref = relative-part [ "?" query ] [ "#" fragment ]
relative-part = "//" authority path-abempty
/ path-absolute
/ path-noscheme
/ path-empty
The URI referred to by a relative reference, also known as the target URI, is obtained by applying the reference resolution
algorithm of Section 5.
A relative reference that begins with two slash characters is
termed a network-path reference (emphasis mine); such references are rarely used.
A relative reference that begins with a single slash character is termed an absolute-path reference. A relative reference that does not begin with a slash character is termed a relative-path reference.
A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative- path reference.

Correct way to append a query string param when at the top level of a domain?

What is the correct syntax for appending a query string parameter when I'm currently at the "root" of my site?
1) http://www.example.com/?foo=bar
2) http://www.example.com?foo=bar
3) http://www.example.com/&foo=bar
I'm certain #3 is completely wrong, but I'm unsure about 1 or 2. I've never come across this scenario, it's always been appended to a filename with it's extension -- this odd use case came up on our team and I was left scratching my head.
Number one is the only correct one. See RFC1738 3.3:
An HTTP URL takes the form:
http://<host>:<port>/<path>?<searchpart>
where <host> and <port> are as described in Section 3.1. If :<port>
is omitted, the port defaults to 80. No user name or password is
allowed. <path> is an HTTP selector, and <searchpart> is a query
string. The <path> is optional, as is the <searchpart> and its
preceding "?". If neither <path> nor <searchpart> is present, the "/"
may also be omitted.

Resources