Cookie without a name? - http

What is going on with the following cookie:
"=value"
In Chrome and Firefox this is identical to:
"value"
i.e. the value for empty cookie name becomes a cookie name.
Is there any official reason for this behavior?

It looks like a bug, since rfc says:
If the name string is empty, ignore the set-cookie-string entirely.

The cookie RFC standards are a bit vague and contradictory in places, and have also changed behaviour over various revisions. Consequently, the browsers also have varying behaviour as far as the requirements for cookies. So in short, for some browsers an empty cookie name is fine, for others not. If this is an app you're building (that you want to work across the various browsers) then you'd be probably safest setting a cookie name.
https://www.rfc-editor.org/rfc/rfc6265#section-5.2
5. If the name string is empty, ignore the set-cookie-string
entirely.
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis-05#section-5.3
2. If the name-value-pair string lacks a %x3D ("=") character, then
the name string is empty, and the value string is the value of
name-value-pair.
Otherwise, the name string consists of the characters up to, but
not including, the first %x3D ("=") character, and the (possibly
empty) value string consists of the characters after the first
%x3D ("=") character.

I stumbled upon the same question today.
To clarify the answer of #buffoonism ...
https://stackoverflow.com/a/72250741/2323764
The set-cookie header must be ignored.

Related

Is this a valid URL ?& empty parameter

is something like
www.example.com?&hello=world
valid can you have ?& on valid URL, I tested on firefox, chrome, and safari, they seem to just remove the & between the ? and the h
thanks
The part between ? and # or the end of the URI is called “query”. According to the specification there are no restrictions on how the part actually may look like. So yes, it is a valid URI. The key=value format is just one possible (albeit the most common one) format for that place.
As for how servers or clients parsing that URL of yours behave, they will just ignore that empty section. So no, it’s not a problem, and you won’t run into issues because of that extraneous ampersand.

Empty URI query string parameters: "a=&b=" versus "a&b"

Should the following URLs be considered functionally equivalent?
http://example.com/foo?a=&b=
http://example.com/foo?a&b
This came about when a user of a Drupal module I wrote which parses apart and then rewrites URIs noticed that the code sometimes causes the query string parts to change in unexpected ways due to how some of the underlying PHP functions behave. For example:
parse_str("a&b", $values); print http_build_query($values);
a=&b=
Is this something I should bother worrying about?
Edit so SO stops complaining that this question is similar to another one: The question is whether it's safe to assume that "no value for X" and "empty value for X" are equivalent, not whether the "no value" style is syntactically correct (which it is).
RFC 3986 Uniform Resource Identifier (URI): Generic Syntax doesn't have anything to say about the structure of the query string aside from how characters like ? should be dealt with. So strictly speaking, your two example URLs are different. Of course, the application which receives those query strings may treat them as functionally equivalent, but this isn't something you can determine from the URL alone.
As per RFC6570 empty query parameters are allowed. Please refer to section 3.2.9
Example Template Expansion
{&x,y,empty} &x=1024&y=768&empty=

Authoritative position of duplicate HTTP GET query keys

I am having trouble on finding authoritative information about the behavior with HTTP GET query string duplicate fields, like
http://example.com/page?field=foo&field=bar
and in particular if the order is kept or not. Most web-oriented languages produce an array containing both foo and bar associated to a key "field", but I would like to know if authoritative statement exist (e.g. on a RFC) about this point. RFC 3986 has a section 3.4. Query, which refers to key=value pairs, but nothing is said on how to interpret order and duplicate fields and so on. This makes sense, since it's backend dependent, and not in the scope of that RFC...
Although a de-facto standard exists, I'd like to see an authoritative source for it, just out of curiosity.
There is no spec on this. You may do what you like.
Typical approaches include: first-given, last-given, array-of-all, string-join-with-comma-of-all.
Suppose the raw request is:
GET /blog/posts?tag=ruby&tag=rails HTTP/1.1
Host: example.com
Then there are various options for what request.query['tag'] should yield, depending on the language or the framework:
request.query['tag'] => 'ruby'
request.query['tag'] => 'rails'
request.query['tag'] => ['ruby', 'rails']
request.query['tag'] => 'ruby,rails'
The situation seems to have changed since this question was asked and the accepted answer was written 12 years ago. I believe we now have an authoritative source: The WHATWG URL Standard describes the process of extracting and parsing a query string in detail in section 6.2 (https://url.spec.whatwg.org/#interface-urlsearchparams) and section 5.1 on x-www-form-urlencoded parsing (https://url.spec.whatwg.org/#urlencoded-parsing). The parsing output is "an initially empty list of name-value tuples where both name and value hold a string", where a list is defined as a finite ordered sequence, and the key-value pairs are added to this list in the order they appear in the URL. At first there is no mention of repeated keys, but some methods on the URLSearchParams class in section 6.2 (https://url.spec.whatwg.org/#interface-urlsearchparams) set clear expectations on ordering: "The getAll(name) method steps are to return the values of all name-value pairs whose name is name... in list order"; The sort() method specifies that "The relative order between name-value pairs with equal names must be preserved." (Emphasis mine). Examining the Github issue referenced in the commit where the sort method was added, we see that the original proposal was to sort on values where keys were identical, but this was changed: "The reason for the default sort not affecting the value order is that ordering of the values can be significant. We should not assume that it's ok to move the order of the values around." (https://github.com/whatwg/url/issues/26#issuecomment-271600764)
I can confirm that for PHP (at least in version 4.4.4 and newer) it works like this:
GET /blog/posts?tag=ruby&tag=rails HTTP/1.1
Host: example.com
results in:
request.query['tag'] => 'rails'
But
GET /blog/posts?tag[]=ruby&tag[]=rails HTTP/1.1
Host: example.com
results in:
request.query['tag'] => ['ruby', 'rails']
This behavior is the same for GET and POST data.
yfeldblum's answer is perfect.
Just a note about a fifth behavior I noticed recently: on Windows Phone, opening an application with an uri with a duplicate query key will result in NavigationFailed with:
System.ArgumentException: An item with the same key had already been added.
The culprit is System.Windows.Navigation.UriParsingHelper.InternalUriParseQueryStringToDictionary(Uri uri, Boolean decodeResults).
So the system won't even let you handle it the way you want, it will forbid it. You are left with the only solution to choose your own format (CSV, JSON, XML, ...) and uri-escape-it.
Most (all?) of the frameworks offer no guarantees, so assume they will be returned in random order.
Always take the safest approach.
For example, java HttpServlet interface:
ServletRequest.html#getParameterValues
Even the getParameterMap method leaves out any mention about parameter order (the order of a java.util.Map iterator cannot be relied on either.)
Typically, duplicate parameter values like
http://example.com/page?field=foo&field=bar
result in a single queryString parameter that is an array:
field[0]=='foo'
field[1]=='bar'
I've seen this behavior in ASP, ASP.NET and PHP4.
The ?array[]=value1&array[]=value2 approach is certainly a very popular one.
supported by most Javascript frameworks
supported by Java Spring
supported by PHP

when assigning location.href, please explain url encoding (in asp.net and firefox)

In some javascript, I have:
var url = "find.aspx?" + "location=" + encodeURIComponent( address );
alert( url );
location.href = url;
where the value of address is the string "Seattle, WA".
In the alert I see
find.aspx?Seattle%2C%20WA
as I expect.
But on the server side, when I look at Request.Url, the relevant substring I see is
find.aspx?Seattle, WA
And in the Firefox url window I see
find.aspx?location=Seattle%2C WA
So I'm getting three different representations whereas I would expect that in all three places I should see what I see in the alert. My expectation is that the url I assign to location.href should show up as-is in the browser url window, and should be passed as-is to the server in Request.Url (and I would need to decode the values on the server before using them). What's happening?
Firefox converts certain encoded characters into their literal forms as a way to be friendly to users. It will also convert spaces typed into the address bar into %20 for the server.
Update: The reason Firefox doesn't display the comma unencoded is because commas are allowed in URLs, but spaces are not, so it knows that a space is going to be unambiguously interpreted, whereas the pre-encoded comma is different from a non-encoded comma to some servers. see: Can I use commas in a URL?
ASP is probably trying to help you out by auto-un-encoding the string for you.
Update: It looks like ASP.NET unencodes Request.Url for you by default, as mentioned here: QueryString malformed after URLDecode They also mention that you can use HttpRequest.Url.Query to access the un-decoded version.
The alert is the only thing not doing any "magic" for you.
For the alert, you are doing the encoding yourself. Perhaps it looks the same as on the server-side if you removed encodeURIComponent.
On the server side, ASP.NET will always show you the unencoded form. This is to make it easier to directly map to files that also have text that needed to be (un)encoded.
Note that you can replace every letter for its UTF8 representation in URL Encoding. It will still be the same URL. I.e., type the following in the browser window and it will still work: %66%59%6E%64.aspx?location=Seattle%2C%20WA. To only encode the necessary chars, use UrlEncode on the server side if you create a link yourself.
URL encoding can become fairly tricky. You ask to explain it. To know the correct escape of a certain character, you need to know how that character looks in UTF8. The hexadecimal value of the UTF-8 bytes then become the %XX%YY value of your letter. Sometimes it's one %XX, but it can be up to six byte sequences in total (some Chinese characters for instance).
URL Encoding works one way only. Never double-encode or double-unencode. This is prohibited by the specification. Also, because you can encode any character, it is not always possible (as you found out) to do roundtrip encoding/unencoding. If you unencode and re-encode again, it is well possible that the resulting string is different, but syntactically the same.
In HTML, URL Encoding is sometimes interspersed with HTML Encoding. I.e., the ampersand is valid in HTML, but not in HTML. find.aspx?city=A&name=B becomes find.aspx?city=A&name=B in and HTML URL. However, browsers are lenient and will accept wrongly HTML-encoded strings.
Finally, a not on the browser: if you type in a space in a link, even inside an <a> tag, it will escape the space (or other character) for you. Likewise, it will nowadays show the odd characters (é, ï etc) in the address bar, but when it sends it over HTTP, the browser will correctly do the encoding for you.
Update: about anwering your question of needing a "definitive" reference or proof.
While I couldn't find any on the internet, I decided to look for it myself using Reflector. Going through the methods that set, for instance, the HttpRequest.QueryString, you quickly encounter the private method HttpRequest.FillInQueryStringCollection which then calls HttpValueCollection.FillfromEncodedBytes. Somewhat near the end of that method, HttpUtility.UrlDecode is called for the values. Conclusion: do not call it yourself, to prevent double decoding.
You can see this for yourself when you download Reflector and disassemble the .NET libs of System.Web.
For your example you can change this line
var url = "find.aspx?" + "location=" + encodeURIComponent( address );
to
var url = "find.aspx?" + "location=" + address;
and see the address as it is. Bu if address variable contains any '&' character your variable will be corrupt. So you are using encodeURIComponent to encode these things url.
On the Server side all these encoded strings are decoded back. It means encodeURIComponent is just for sending the address variable (whether it contains & character or not) to server side correctly.

Why do cookie values with whitespace arrive at the client side with quotes?

I'm a .NET developer starting to dabble in Java.
In .NET, I can set the value of a cookie to a string with white space in it:
new HttpCookie("myCookieName", "my value") - and when I read that value on the client side (JavaScript), I get the value I expected (my value).
If I do the same thing in a Java servlet - new Cookie("myCookieName", "my value"), I get the value including the double quotes ("my value").
Why the difference? Am I missing something? How do people handle this in the Java world? Do you encode the value and then you decode on the client side?
When you set a cookie value with one of the following values as mentioned in Cookie#setValue(),
With Version 0 cookies, values should not contain white space, brackets, parentheses, equals signs, commas, double quotes, slashes, question marks, at signs, colons, and semicolons. Empty values may not behave the same way on all browsers.
then the average container will implicitly set the cookie to version 1 (RFC 2109 spec) instead of the default version 0 (Netscape spec). The behaviour is not specified by the Servlet API, the container is free to implement it (it may for example throw some IllegalArgumentException). As far as I know, Tomcat, JBoss AS and Glassfish behave all the same with regard to implicitly changing the cookie version. For at least Tomcat and JBoss AS this is the consequence of fixes for this security issue.
A version 1 cookie look like this:
name="value with spaces";Max-Age=3600;Path=/;Version=1
while a version 0 compatible cookie look like this:
name=value%20with%20spaces;Expires=Mon, 29-Aug-2011 14:30:00 GMT;Path=/
(note that an URL-encoded value is valid for version 0)
Important note is that Microsoft Internet Explorer doesn't support version 1 cookies. Even not the current IE 11 release. It'll interpret the quotes being part of the whole cookie value and will treat and return that accordingly. It does not support the Max-Age attribute and it'll ignore it altogether which causes that the cookie's lifetime defaults to the browser session. You was apparently using IE to test the cookie handling of your webapp.
To support MSIE as well, you really need to URL-encode and URL-decode the cookie value yourself if it contains possibly characters which are invalid for version 0.
Cookie cookie = new Cookie(name, URLEncoder.encode(value, "UTF-8"));
// ...
and
String value = URLDecoder.decode(cookie.getValue(), "UTF-8"));
// ...
In order to support version 1 cookies for the worldwide audience, you'll really wait for Microsoft to fix the lack of MSIE support and that the browser with the fix has become mainstream. In other words, it'll take ages (update: as of now, 5+ years later, it doesn't seem to ever going to happen). In the meanwhile you'd best stick to version 0 compatible cookies.
As far as I know, spaces must be encoded in cookies. Different browsers react differently to un-encoded cookies. You should URL-encode your cookie before setting it.
String cookieval = "my value";
String cookieenc = URLEncoder.encode(cookieval, "UTF-8");
res.addCookie(new Cookie("myCookieName", cookieenc));
ASP.NET does the encoding automatically, in Java you have to do it yourself. I suspect the quotes you see are added by the user agent.
It probably has to do with the way Java encodes the cookie. I suggest you try calling setVersion(1) on the new cookie and see if that works for you.
Try using setVersion(0).
HttpCookie cookie = new HttpCookie("name", "multi word value");
System.out.println(cookie.toString());
prints:
name="several word value"
But after setting
cookie.setVersion(0);
System.out.println(cookie.toString());
prints:
name=several word value
Encoding is a good idea too, but the quotes around the value look to be an independent issue.

Resources