Can RSS GUIDs contain hashes? - rss

Is it OK to use the hash character in the GUID field in an RSS field, and if so are thing#1 and thing#2 considered to be separate GUIDs?

From the RSS 2.0 Spec, <guid> sub-element of <item>:
There are no rules for the syntax of a guid. Aggregators must view them as a string. It's up to the source of the feed to establish the uniqueness of the string.

"Aggregators must view the guid as a string. There are no rules for the syntax. It's up to the creator of the RSS document, to establish the uniqueness of the string." Source: w3schools

Related

What Is the Use of the Random Characters in the Query String Section of a URL to a CSS File?

I often see URLs in the following format:
revslider/rs-plugin/css/settings.css?nzs0qa
What does the query string ?nzs0qa mean?
nzs0qa is a ramdom string. The purpose of the "?nzs0qa" is to force your browser to download the latest version of the css. You know browser will keep the static files such as css, js, etc in cache. If you update the css, but your visitors use old version, it might be some mistake.
That's all.
The part of the question mark is called the query part. The question mark itself is not part of the query.
I suggest reading about URL format.
From Wikipedia:
https://en.wikipedia.org/wiki/Uniform_Resource_Locator
An optional query, separated from the preceding part by a question mark (?), containing a query string of non-hierarchical data. Its syntax is not well defined, but by convention is most often a sequence of attribute–value pairs separated by a delimiter.

jsonapi.org correct way to use pagination using the page query string

In the documentation for jsonapi for pagination is says the following:
For example, a page-based strategy might use query parameters such as
page[number] and page[size]
How would I represent this in the query string? http://localhost:4200/people?page[number]=1&page[size]=25, I don't think using a map link structure is a valid query string. Only the page parameter is reserved according to the documentation.
I don't think using a map link structure is a valid query string.
You're right technically, and that's why the spec has the note that says:
Note: The example query parameters above use unencoded [ and ] characters simply for readability. In practice, these characters must be percent-encoded, per the requirements in RFC 3986.
So, page[size] is really page%5Bsize%5D which is a valid query parameter name.
Only the page parameter is reserved according to the documentation.
When the spec text says that only page is reserved, it actually means that any page[......] style query parameter is reserved. (I can tell you that for sure as one of the spec's editors.) But it should say so more explicitly, so I'll open an issue for it.

User info in URI without password

I know that URI supports the following syntax:
http://[user]:[password]#[domain.tld]
When there is no password or if the password is empty, is there a colon?
In other words, should I accept this:
http://[user]:#[domain.tld]
Or this:
http://[user]#[domain.tld]
Or are they both valid?
The current URI standard (STD 66) is RFC 3986, and the relevant section is 3.2.1. User Information.
There it’s defined that the userinfo subcomponent (which gets followed by #) can contain any combination of
the character :,
percent-encoded characters, and
characters from the sets unreserved and sub-delims.
So this means that both of your examples are valid.
However, note that the format user:password is deprecated. Anyway, they give recommendations how applications should handle such URIs, i.e., everything after the first : character should not be displayed by applications, unless
the data after the colon is the empty string (indicating no password).
So according to this recommendation, the userinfo subcomponent user: indicates that there is the username "user" and no password.
This is more like convenience and both are valid. I would go with http://[user]#[domain.tld] (and prompt for a password.) because it's simple and not ambiguous. It does not give any chance for user to think if he has to add anything after :

.Net Uri Encoding RFC 2396 vs RFC 3986

First, some quick background... As part of an integration with a third party vendor, I have a C# .Net web application that receives a URL with a bunch of information in the query string. That URL is signed with an MD5 hash and a shared secret key. Basically, I pull in the query string, remove their hash, perform my own hash on the remaining query string, and make sure mine matches the one that was supplied.
I'm retrieving the Uri in the following way...
Uri uriFromVendor = new Uri(Request.Url.ToString());
string queryFromVendor = uriFromVendor.Query.Substring(1); //Substring to remove question mark
My issue is stemming from query strings that contain special characters like an umlaut (ü). The vendor is calculating their hash based on the RFC 2396 representation which is %FC. My C# .Net app is calculating it's hash based on the RFC 3986 representation which is %C3%BC. Needless to say, our hashes don't match, and I throw my errors.
Strangely, the documentation for the Uri class in .Net says that it should follow RFC 2396 unless otherwise set to RFC 3986, but I don't have the entry in my web.config file that they say is required for this behavior.
How can I force the Uri constructor to use the RFC 2396 convention?
Failing that, is there an easy way to convert the RFC 3986 octet pairs to RFC 2396 octets?
Nothing to do with your question, but why are you creating a new Uri here? You can just do string queryFromVendor = Request.Url.Query.Substring(1); – atticae
+1 for atticae! I went back to try removing the extraneous Uri I was creating and suddenly, the string had the umlaut encoded as UTF-8 instead of UTF-16.
At first, I didn't think this would work. Somewhere along the line, I had tried retrieving the url using Request.QueryString, but this was causing the umlaut to come through as %ufffd which is the � character. In the interest of taking a fresh perspective, I tried atticae's suggestion and it worked.
I'm pretty sure the answer has to do with something I read here.
C# uses UTF-16 in all its strings, with tools to encode when it comes to dealing with streams and files that bring us onto...
ASP.NET uses UTF-8 by default, and it's hard to think of a time when it isn't a good choice...
My problems stemmed from here...
Uri uriFromVendor = new Uri(Request.Url.ToString());
By taking the Request.Url uri and creating another uri, it was encoding as the C# standard UTF-16. By using the original uri, it remained in the .Net standard UTF-8.
Thanks to all for your help.
I'm wondering if this is a bit of a red herring:
I say this because FC is the UTF16 representation of the u with umlaut; C2BC is the UTF8 representation.
I wonder if one of the System.Text.Encoding methods to convert the source data into a normal .Net string might help.
This question might be of interest too: Encode and Decode rfc2396 URLs
I don't know about the standard encoding for Uri constructors, but if everything else fails you could always decode the URL yourself and encode it in whatever encoding you like.
The HttpUtility-Class has an UrlDecode() and UrlEncode() method, which lets you specify the System.Text.Encoding as second parameter.
For example:
string decodedQueryString = HttpUtility.UrlDecode(Request.Url.Query.Substring(1));
string encodedQueryString = HttpUtility.UrlEncode(decodedQueryString, System.Text.Encoding.GetEncoding("utf-16"));
// calc hash here

Query string: Can a query string contain a URL that also contains query strings?

Example:
http://foo.com/generatepdf.aspx?u=http://foo.com/somepage.aspx?color=blue&size=15
I added the iis tag because I am guessing it also depends on what server technology you use?
The server technology shouldn't make a difference.
When you pass a value to a query string you need to url encode the name/value pair. If you want to pass in a value that contains a special character such as a question mark (?) you'll just need to encode that character as %3F. If you then needed to recursively pass another query string to the encoded url, you'll need to double/triple/etc encode the url resulting in the original ? turning into %253F, %25253F, etc.
you'll probably want to UrlEncode the url that is in the query string.
As reported in http://en.wikipedia.org/wiki/Query_string
W3C recommends that all web servers support semicolon separators in
addition to ampersand separators (link reported on that wiki page) to allow
application/x-www-form-urlencoded query strings in URLs within HTML
documents without having to entity escape ampersands.
So, I suppose the answer to the question is yes and you have to change in a ";" semicolon the "&" ampersand usaully used for key=value separator.
Yes it can, as far as I can tell, according to RFC 3986: Uniform Resource Identifier (URI): Generic Syntax (from year 2005):
This is the BNF for the query string:
query = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "#"
The spec says:
The characters slash ("/") and question mark ("?") may represent data within the query component.
as query components are often used to carry identifying information in the form of "key=value" pairs and one frequently used value is a reference to another URI, it is sometimes better for usability to avoid percent-encoding those characters
(But I suppose your server framework might or might not follow the specification exactly.)
No, but you can encode the url and decode it later.

Resources