Servlet stripping parameter values because of # character - servlets

My URL is http://175.24.2.166/download?a=TOP#0;ONE=1;TWO2.
How should I encode the parameter so that when I print the parameter in the Servlet, I get the value in its entirety? Currently when I print the value by using request.getParameter("a") I get the output as TOP instead of TOP#0;ONE=1;TWO2.

You should encode it like this http://175.24.2.166/download?a=TOP%230%3BONE%3D1%3BTWO2 . There are a lot of the encoders in Java, you can try to use URLEncoder or some online encoders for experements

This is known as the "fragment identifier".
as mentioned in wiki
The fragment identifier introduced by a hash mark # is the optional last part of a URL for a document. It is typically used to identify a portion of that document.
the part after the # is info for the client. Put everything your client needs here.
you need to encode your query string.
you can use encodeURIComponent() function in JavaScript encodes a URI component.This function encodes special characters.

Related

Response Parsed Body in Paw. How to identify the Key-Path in a long url?

I have a Paw related question.
Does anybody know how to extract a value from an encoded URL response field with Paw? The value is the only part of the encoded URL which starts with a %3D (the URL encoded version of an = sign).
Getting the dynamic values out of JSON, a JSON array, a URL, etc worked great.
You can use our RegExp Match dynamic value for this: https://luckymarmot.com/paw/extensions/RegExMatch
insert the RegExp Match dynamic value first
as input for RegExp Match use the Response Parsed Body dynamic value (with the key path to the url-encoded field with the id)
write the regular expression to extract the id from the field (see example in the screenshot)
Excellent point Natalia. Instead of the Regex extension I used the Substring extension. This worked perfectly as the size of the encoded URL never changed.

jsonapi.org correct way to use pagination using the page query string

In the documentation for jsonapi for pagination is says the following:
For example, a page-based strategy might use query parameters such as
page[number] and page[size]
How would I represent this in the query string? http://localhost:4200/people?page[number]=1&page[size]=25, I don't think using a map link structure is a valid query string. Only the page parameter is reserved according to the documentation.
I don't think using a map link structure is a valid query string.
You're right technically, and that's why the spec has the note that says:
Note: The example query parameters above use unencoded [ and ] characters simply for readability. In practice, these characters must be percent-encoded, per the requirements in RFC 3986.
So, page[size] is really page%5Bsize%5D which is a valid query parameter name.
Only the page parameter is reserved according to the documentation.
When the spec text says that only page is reserved, it actually means that any page[......] style query parameter is reserved. (I can tell you that for sure as one of the spec's editors.) But it should say so more explicitly, so I'll open an issue for it.

Why is HttpUtility.UrlPathEncode marked as "do not use"?

Why does the documentation of .NET for HttpUtility.UrlPathEncode for .NET 4.5 state
Do not use; intended only for browser compatibility. Use UrlEncode.
UrlEncode does not do the same, it encodes a string for the parameter part of a URL, not for the path part. Is there a better way to encode a string for the path part and why shouldn't I use this function, which is in the framework since 1.1 and works?
based on MSDN, they recommend to use UrlEncode to grantee that it is working for all platforms and browsers
You can encode a URL using with the UrlEncode method or the UrlPathEncode method. However, the methods return different results. The UrlEncode method converts each space character to a plus character (+). The UrlPathEncode method converts each space character into the string "%20", which represents a space in hexadecimal notation. Use the UrlPathEncode method when you encode the path portion of a URL in order to guarantee a consistent decoded URL, regardless of which platform or browser performs the decoding.
Also UrlEncode is using UTF-8 encoding so if you are sending query string in different language like Arabic you should use UrlEncode
Use Uri.EscapeUriString if you want encode a path of a URL. HttpUtility.UrlEncode is for query parameters and encode also slashes which is wrong for a path.

String Functions in IIS Url Rewrite Module

The IIS URL Rewrite Module ships with 3 built-in functions:
* ToLower - returns the input string converted to lower case.
* UrlEncode - returns the input string converted to URL-encoded format. This function can be used if the substitution URL in rewrite rule contains special characters (for example non-ASCII or URI-unsafe characters).
* UrlDecode - decodes the URL-encoded input string. This function can be used to decode a condition input before matching it against a pattern.
The functions can be invoked by using the following syntax:
{function_name:any_string}
The question is: can this list be extended by introducing a Replace function that's available for changing values within a rewrite rule action or condition?
Another way to frame the question: is there any way to do a global replace on a URL coming in using this module?
It seems that you're limited to using regular expressions and back-references to construct strings - i.e. there's no search/replace mechanism to replace every instance of X with Y in {REQUEST_URI}, without knowing how many instances there are.
I've had a quick glance at the extensibility introduced in the 2.0 RTW and don't see any 'light' means of introducing this.
Looks like you have to implement your own provider as shown here:
http://learn.iis.net/page.aspx/804/developing-a-custom-rewrite-provider-for-url-rewrite-module/

when assigning location.href, please explain url encoding (in asp.net and firefox)

In some javascript, I have:
var url = "find.aspx?" + "location=" + encodeURIComponent( address );
alert( url );
location.href = url;
where the value of address is the string "Seattle, WA".
In the alert I see
find.aspx?Seattle%2C%20WA
as I expect.
But on the server side, when I look at Request.Url, the relevant substring I see is
find.aspx?Seattle, WA
And in the Firefox url window I see
find.aspx?location=Seattle%2C WA
So I'm getting three different representations whereas I would expect that in all three places I should see what I see in the alert. My expectation is that the url I assign to location.href should show up as-is in the browser url window, and should be passed as-is to the server in Request.Url (and I would need to decode the values on the server before using them). What's happening?
Firefox converts certain encoded characters into their literal forms as a way to be friendly to users. It will also convert spaces typed into the address bar into %20 for the server.
Update: The reason Firefox doesn't display the comma unencoded is because commas are allowed in URLs, but spaces are not, so it knows that a space is going to be unambiguously interpreted, whereas the pre-encoded comma is different from a non-encoded comma to some servers. see: Can I use commas in a URL?
ASP is probably trying to help you out by auto-un-encoding the string for you.
Update: It looks like ASP.NET unencodes Request.Url for you by default, as mentioned here: QueryString malformed after URLDecode They also mention that you can use HttpRequest.Url.Query to access the un-decoded version.
The alert is the only thing not doing any "magic" for you.
For the alert, you are doing the encoding yourself. Perhaps it looks the same as on the server-side if you removed encodeURIComponent.
On the server side, ASP.NET will always show you the unencoded form. This is to make it easier to directly map to files that also have text that needed to be (un)encoded.
Note that you can replace every letter for its UTF8 representation in URL Encoding. It will still be the same URL. I.e., type the following in the browser window and it will still work: %66%59%6E%64.aspx?location=Seattle%2C%20WA. To only encode the necessary chars, use UrlEncode on the server side if you create a link yourself.
URL encoding can become fairly tricky. You ask to explain it. To know the correct escape of a certain character, you need to know how that character looks in UTF8. The hexadecimal value of the UTF-8 bytes then become the %XX%YY value of your letter. Sometimes it's one %XX, but it can be up to six byte sequences in total (some Chinese characters for instance).
URL Encoding works one way only. Never double-encode or double-unencode. This is prohibited by the specification. Also, because you can encode any character, it is not always possible (as you found out) to do roundtrip encoding/unencoding. If you unencode and re-encode again, it is well possible that the resulting string is different, but syntactically the same.
In HTML, URL Encoding is sometimes interspersed with HTML Encoding. I.e., the ampersand is valid in HTML, but not in HTML. find.aspx?city=A&name=B becomes find.aspx?city=A&name=B in and HTML URL. However, browsers are lenient and will accept wrongly HTML-encoded strings.
Finally, a not on the browser: if you type in a space in a link, even inside an <a> tag, it will escape the space (or other character) for you. Likewise, it will nowadays show the odd characters (é, ï etc) in the address bar, but when it sends it over HTTP, the browser will correctly do the encoding for you.
Update: about anwering your question of needing a "definitive" reference or proof.
While I couldn't find any on the internet, I decided to look for it myself using Reflector. Going through the methods that set, for instance, the HttpRequest.QueryString, you quickly encounter the private method HttpRequest.FillInQueryStringCollection which then calls HttpValueCollection.FillfromEncodedBytes. Somewhat near the end of that method, HttpUtility.UrlDecode is called for the values. Conclusion: do not call it yourself, to prevent double decoding.
You can see this for yourself when you download Reflector and disassemble the .NET libs of System.Web.
For your example you can change this line
var url = "find.aspx?" + "location=" + encodeURIComponent( address );
to
var url = "find.aspx?" + "location=" + address;
and see the address as it is. Bu if address variable contains any '&' character your variable will be corrupt. So you are using encodeURIComponent to encode these things url.
On the Server side all these encoded strings are decoded back. It means encodeURIComponent is just for sending the address variable (whether it contains & character or not) to server side correctly.

Resources