Why is HttpUtility.UrlPathEncode marked as "do not use"? - asp.net

Why does the .NET 4.5 documentation for HttpUtility.UrlPathEncode state:
Do not use; intended only for browser compatibility. Use UrlEncode.
UrlEncode does not do the same thing: it encodes a string for the parameter part of a URL, not for the path part. Is there a better way to encode a string for the path part, and why shouldn't I use this function, which has been in the framework since 1.1 and works?

According to MSDN, UrlEncode is recommended to guarantee that the result works on all platforms and browsers:
You can encode a URL using with the UrlEncode method or the UrlPathEncode method. However, the methods return different results. The UrlEncode method converts each space character to a plus character (+). The UrlPathEncode method converts each space character into the string "%20", which represents a space in hexadecimal notation. Use the UrlPathEncode method when you encode the path portion of a URL in order to guarantee a consistent decoded URL, regardless of which platform or browser performs the decoding.
Also, UrlEncode uses UTF-8 encoding, so if you are sending a query string in another language, such as Arabic, you should use UrlEncode.

Use Uri.EscapeUriString if you want to encode the path of a URL. HttpUtility.UrlEncode is meant for query parameters and also encodes slashes, which is wrong for a path.
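To make the path-versus-query difference concrete, here is a minimal sketch. It is not taken from the answers above: the EncodePath helper and the sample path are hypothetical, and it uses Uri.EscapeDataString on each segment rather than Uri.EscapeUriString, because the latter is marked obsolete in .NET 6 and later. It assumes System.Web.HttpUtility is available.

```csharp
using System;
using System.Linq;
using System.Web;

public static class PathEncodingSketch
{
    // Hypothetical helper: percent-encode each path segment on its own so
    // that the '/' separators between segments are left untouched.
    public static string EncodePath(string path) =>
        string.Join("/", path.Split('/').Select(segment => Uri.EscapeDataString(segment)));

    public static void Main()
    {
        string path = "/my docs/annual report.pdf"; // illustrative value

        // Segment-wise encoding keeps the slashes and turns spaces into %20.
        Console.WriteLine(EncodePath(path));

        // HttpUtility.UrlEncode targets query values: it escapes the slashes
        // and turns spaces into '+', which is not what a path needs.
        Console.WriteLine(HttpUtility.UrlEncode(path));
    }
}
```

The first line printed keeps its slashes and uses %20 for spaces; the second shows why UrlEncode is a poor fit for a path.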

Related

Servlet stripping parameter values because of # character

My URL is http://175.24.2.166/download?a=TOP#0;ONE=1;TWO2.
How should I encode the parameter so that when I print the parameter in the Servlet, I get the value in its entirety? Currently when I print the value by using request.getParameter("a") I get the output as TOP instead of TOP#0;ONE=1;TWO2.
You should encode it like this: http://175.24.2.166/download?a=TOP%230%3BONE%3D1%3BTWO2. There are many encoders available in Java; you can try URLEncoder or an online encoder for experiments.
The part of the URL starting at the # is known as the "fragment identifier".
As mentioned on Wikipedia:
The fragment identifier introduced by a hash mark # is the optional last part of a URL for a document. It is typically used to identify a portion of that document.
The part after the # is information for the client only; the browser does not send it to the server. Put everything your client needs there.
You need to encode your query string.
In JavaScript, you can use the encodeURIComponent() function, which encodes a URI component, including its special characters.
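The effect described in these answers can be reproduced with a short sketch. It is written in C# to match the rest of this page, although the question is about a Java servlet; java.net.URLEncoder or encodeURIComponent() play the same role in their languages. The BuildDownloadUrl helper is hypothetical.

```csharp
using System;

public static class QueryValueSketch
{
    // Hypothetical helper: escape the parameter value before appending it.
    public static string BuildDownloadUrl(string value) =>
        "http://175.24.2.166/download?a=" + Uri.EscapeDataString(value);

    public static void Main()
    {
        // Unescaped, everything from '#' onward is parsed as the fragment,
        // so the query string only carries a=TOP.
        var raw = new Uri("http://175.24.2.166/download?a=TOP#0;ONE=1;TWO2");
        Console.WriteLine(raw.Query);    // ?a=TOP
        Console.WriteLine(raw.Fragment); // #0;ONE=1;TWO2

        // Escaped, the whole value stays inside the query string.
        Console.WriteLine(BuildDownloadUrl("TOP#0;ONE=1;TWO2"));
        // http://175.24.2.166/download?a=TOP%230%3BONE%3D1%3BTWO2
    }
}
```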

Response.Redirect Ampersand Encoding

Failing to escape an "&" character in HTML markup creates an entity. It is often done inadvertently when linking URLs in a document, and W3C's Markup Validation Service will consider this an error.
I'm wondering, does ASP.NET's Response.Redirect method expect ampersands to be escaped in its url parameter? From reading its MSDN description, I honestly can't tell.
Pass the URL exactly as it should appear in the address bar in the web browser. For example, if you're trying to redirect to http://example.com/?foo=bar&baz=quux, then pass that exact string as-is to Response.Redirect.
Try UrlEncode: The UrlEncode(String) method can be used to encode the entire URL, including query-string values. If characters such as blanks and punctuation are passed in an HTTP stream without encoding, they might be misinterpreted at the receiving end. URL encoding converts characters that are not allowed in a URL into character-entity equivalents; URL decoding reverses the encoding. For example, when the characters < and > are embedded in a block of text to be transmitted in a URL, they are encoded as %3c and %3e.
System.Web.HttpUtility.UrlEncode(string url)
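As a rough illustration of the distinction raised in this thread, the sketch below separates the two kinds of escaping: percent-encoding of individual values inside the URL, and HTML-escaping of the finished URL when it is written into markup. The BuildUrl helper is hypothetical, and the Response.Redirect call appears only in a comment because it needs a running ASP.NET request.

```csharp
using System;
using System.Net;

public static class RedirectUrlSketch
{
    // Hypothetical helper: escape each value, but leave the '&' that
    // separates the parameters as a plain ampersand.
    public static string BuildUrl(string foo, string baz) =>
        "http://example.com/?foo=" + Uri.EscapeDataString(foo) +
        "&baz=" + Uri.EscapeDataString(baz);

    public static void Main()
    {
        string url = BuildUrl("bar", "quux");

        // This is the string to pass as-is, e.g. Response.Redirect(url).
        Console.WriteLine(url); // http://example.com/?foo=bar&baz=quux

        // Only when the same URL is embedded in HTML markup (for example in
        // an href attribute) does the ampersand become &amp;.
        Console.WriteLine(WebUtility.HtmlEncode(url));
    }
}
```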

In ASP.NET, why is there UrlEncode() AND UrlPathEncode()?

In a recent project, I had the pleasure of troubleshooting a bug that involved images not loading when spaces were in the filename. I thought "What a simple issue, I'll UrlEncode() it!" But, NAY! Simply using UrlEncode() didn't resolve the problem.
The new problem was that the HttpUtility.UrlEncode() method switched spaces ( ) to pluses (+) instead of the %20 the browser wanted. So file+image+name.jpg would return not-found, while file%20image%20name.jpg was found correctly.
Thankfully, a coworker pointed out HttpUtility.UrlPathEncode() to me, which uses %20 for spaces instead of +.
WHY are there two ways of handling Url encoding? WHY are there two commands that behave so differently?
UrlEncode is useful for use with a QueryString as browsers tend to use a + here in place of a space when submitting forms with the GET method.
UrlPathEncode simply replaces all characters that cannot be used within a URL, such as <, > and the space character.
Both MSDN links include this quote:
You can encode a URL using with the UrlEncode method or the
UrlPathEncode method. However, the methods return different results.
The UrlEncode method converts each space character to a plus character
(+). The UrlPathEncode method converts each space character into the
string "%20", which represents a space in hexadecimal notation. Use
the UrlPathEncode method when you encode the path portion of a URL in
order to guarantee a consistent decoded URL, regardless of which
platform or browser performs the decoding.
So in a URL you have the path, then a ?, and then the parameters (i.e. http://some_path/page.aspx?parameters). URL paths encode spaces differently than URL parameters do, which is why there are two versions. For a long time spaces were not valid in a URL path, but were allowed in the parameters.
In other words, the formatting of URLs has changed over time. For a long time only ANSI chars could be in a URL, too.
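The behaviour described in this thread can be checked with a few lines. This is a small sketch using the file name from the question; it assumes System.Web.HttpUtility is available and adds Uri.EscapeDataString and Uri.UnescapeDataString for comparison, which the answers above do not mention.

```csharp
using System;
using System.Web;

public static class SpaceEncodingSketch
{
    public static void Main()
    {
        string fileName = "file image name.jpg";

        // Query-style encoding: spaces become '+'.
        Console.WriteLine(HttpUtility.UrlEncode(fileName));     // file+image+name.jpg

        // Path-style encoding: spaces become %20.
        Console.WriteLine(HttpUtility.UrlPathEncode(fileName)); // file%20image%20name.jpg
        Console.WriteLine(Uri.EscapeDataString(fileName));      // file%20image%20name.jpg

        // A '+' only means "space" to a form/query decoder. A plain
        // percent-decoder leaves it as a literal plus sign, which is why
        // file+image+name.jpg was not found.
        Console.WriteLine(HttpUtility.UrlDecode("file+image+name.jpg"));  // file image name.jpg
        Console.WriteLine(Uri.UnescapeDataString("file+image+name.jpg")); // file+image+name.jpg
    }
}
```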

.Net Uri Encoding RFC 2396 vs RFC 3986

First, some quick background... As part of an integration with a third party vendor, I have a C# .Net web application that receives a URL with a bunch of information in the query string. That URL is signed with an MD5 hash and a shared secret key. Basically, I pull in the query string, remove their hash, perform my own hash on the remaining query string, and make sure mine matches the one that was supplied.
I'm retrieving the Uri in the following way...
Uri uriFromVendor = new Uri(Request.Url.ToString());
string queryFromVendor = uriFromVendor.Query.Substring(1); //Substring to remove question mark
My issue stems from query strings that contain special characters like an umlaut (ü). The vendor is calculating their hash based on the RFC 2396 representation, which is %FC. My C# .Net app is calculating its hash based on the RFC 3986 representation, which is %C3%BC. Needless to say, our hashes don't match, and I throw my errors.
Strangely, the documentation for the Uri class in .Net says that it should follow RFC 2396 unless otherwise set to RFC 3986, but I don't have the entry in my web.config file that they say is required for this behavior.
How can I force the Uri constructor to use the RFC 2396 convention?
Failing that, is there an easy way to convert the RFC 3986 octet pairs to RFC 2396 octets?
Nothing to do with your question, but why are you creating a new Uri here? You can just do string queryFromVendor = Request.Url.Query.Substring(1); – atticae
+1 for atticae! I went back to try removing the extraneous Uri I was creating and suddenly, the string had the umlaut encoded as UTF-8 instead of UTF-16.
At first, I didn't think this would work. Somewhere along the line, I had tried retrieving the url using Request.QueryString, but this was causing the umlaut to come through as %ufffd which is the � character. In the interest of taking a fresh perspective, I tried atticae's suggestion and it worked.
I'm pretty sure the answer has to do with something I read here.
C# uses UTF-16 in all its strings, with tools to encode when it comes to dealing with streams and files that bring us onto...
ASP.NET uses UTF-8 by default, and it's hard to think of a time when it isn't a good choice...
My problems stemmed from here...
Uri uriFromVendor = new Uri(Request.Url.ToString());
By taking the Request.Url uri and creating another uri, it was encoding as the C# standard UTF-16. By using the original uri, it remained in the .Net standard UTF-8.
Thanks to all for your help.
I'm wondering if this is a bit of a red herring:
I say this because FC is the UTF-16 code unit for the u with umlaut (U+00FC, which is also its single-byte value in ISO-8859-1); C3 BC is the UTF-8 representation.
I wonder if one of the System.Text.Encoding methods to convert the source data into a normal .Net string might help.
This question might be of interest too: Encode and Decode rfc2396 URLs
I don't know about the standard encoding for Uri constructors, but if everything else fails you could always decode the URL yourself and encode it in whatever encoding you like.
The HttpUtility class has UrlDecode() and UrlEncode() methods, which let you specify a System.Text.Encoding as the second parameter.
For example:
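// Decode the incoming query string (UrlDecode assumes UTF-8 by default).
string decodedQueryString = HttpUtility.UrlDecode(Request.Url.Query.Substring(1));
// Re-encode as ISO-8859-1 so that ü becomes a single %fc; "utf-16" would emit two bytes per character.
// Note: encoding the whole string also escapes '&' and '='; encode each value separately if the hash covers them.
string encodedQueryString = HttpUtility.UrlEncode(decodedQueryString, System.Text.Encoding.GetEncoding("iso-8859-1"));
// calc hash here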
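The two representations of ü discussed in this thread come down to which byte encoding is applied before percent-encoding. The following sketch shows that under the assumption that the vendor's %FC corresponds to ISO-8859-1; the Encode helper is hypothetical, and HttpUtility emits lowercase hex digits.

```csharp
using System;
using System.Text;
using System.Web;

public static class UmlautEncodingSketch
{
    // Hypothetical helper: percent-encode a value using the given charset.
    public static string Encode(string value, Encoding encoding) =>
        HttpUtility.UrlEncode(value, encoding);

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string umlaut = "\u00FC"; // ü

        // One byte in ISO-8859-1, two bytes in UTF-8.
        Console.WriteLine(Encode(umlaut, latin1));        // %fc
        Console.WriteLine(Encode(umlaut, Encoding.UTF8)); // %c3%bc

        // Decoding must use the same charset that produced the bytes.
        Console.WriteLine(HttpUtility.UrlDecode("%FC", latin1) == umlaut);           // True
        Console.WriteLine(HttpUtility.UrlDecode("%C3%BC", Encoding.UTF8) == umlaut); // True
    }
}
```

If the hash is computed over the encoded text, both sides also need to agree on the case of the hex digits, since %fc and %FC hash differently.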

ASP.NET Base64 string corruption

I am passing an object from one asp.net page to another. I'm encoding the object as a Base64 string and passing it as a POST parameter. However, when the receiving page reads the POST value, if there is a + sign in the Base64 string, it is being replaced with a line break. For example:
...AABDEDS+DFEAED...
becomes
...AABDEDS
DFEAED...
I compared the Base64 string immediately after encoding in the sending page to the string immediately before decoding in the receiving page and that is the only difference. I tried HtmlEncoding() the base64 string prior to writing it to the request stream, but that had no effect, so it seems to be an issue on the receiving end.
Any ideas?
Use UrlEncode. The + is a reserved character and needs to be encoded.
When you pass the base64 string in the parameter, you need to URL Encode it (so the characters come across properly). Use:
Server.UrlEncode(base64String); // Server is the page's System.Web.HttpServerUtility instance
HttpServerUtility.UrlEncode Method (String) (System.Web)
The + symbol is a special URL character that on its own evaluates to a space in the URL.
You'll need to Server.UrlEncode your Base64 string on one side (which will turn the plus into %2B) and Server.UrlDecode it on the other side.
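A standalone sketch of the same point follows. It does not use the Server object from the answers, since that needs a running ASP.NET page; it relies on Uri.EscapeDataString and HttpUtility.UrlDecode instead. The ToParameterValue helper and the payload bytes are hypothetical, chosen so that the Base64 text consists only of plus signs.

```csharp
using System;
using System.Web;

public static class Base64ParameterSketch
{
    // Hypothetical helper: make Base64 text safe to send as a form or query value.
    public static string ToParameterValue(byte[] payload) =>
        Uri.EscapeDataString(Convert.ToBase64String(payload));

    public static void Main()
    {
        byte[] payload = { 0xFB, 0xEF, 0xBE }; // encodes to "++++" in Base64
        string base64 = Convert.ToBase64String(payload);
        Console.WriteLine(base64);                    // ++++
        Console.WriteLine(ToParameterValue(payload)); // %2B%2B%2B%2B

        // Sent unescaped, each '+' is read back as a space by a form decoder.
        Console.WriteLine("[" + HttpUtility.UrlDecode(base64) + "]");

        // Escaped first, the value round-trips unchanged.
        Console.WriteLine(HttpUtility.UrlDecode(ToParameterValue(payload)) == base64); // True
    }
}
```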
