URL-Encoded Angle Brackets in URL?

I'm working on a legacy app and for whatever reason it's trying to stuff URL-encoded angle brackets into a URL. For example, to get a URL ending with "<sometext>":
http://somesite.com/somefolder/%3csometext%3e
When the above URL-encoded URL is fetched, it generates a 400 error (Bad Request) on IIS6 and I can't quite figure out why. Probably something simple, but I'm stumped.
Ideas? Thanks.

You must have the URLScan tool installed (http://technet.microsoft.com/en-us/security/cc242650.aspx), which disallows angle brackets in any form.
According to this,
The new default urlscan.ini contains a rule in it to protect against these sort of patterns and the rule is just simply:
[DenyQueryStringSequences]
<
>

Related

ASP.NET Core URL Parameter Decoding

I have an ASP.NET Core web API and an issue with encoded URLs in query parameters.
I have a URL parameter like 'path/to/IDENTIFIER'. The IDENTIFIER part is something like 'HÄÄ/20/19'. This is URL-encoded in the frontend to build a link URL. The result is a link like
domain.com/new/stuff/path/to/H%C3%84%C3%84%2F20%2F19
Now, at some point, user gets redirected to a controller where this URL is used in a query parameter like:
param=%2Fpath%2Fto%2FH%C3%84%C3%84%2F20%2F19
I'm using request query to get the param
var param = HttpContext.Request.Query["param"].ToString();
After this the value of param is
%2Fpath%2Fto%2FHÄÄ%2F20%2F19
So the LATIN CAPITAL LETTER A WITH DIAERESIS characters are automatically decoded, while the other encoded characters are not.
The actual problem comes when I'm redirecting the user to this URL. It ends up with a referer header where it causes havoc with an error message
System.InvalidOperationException: Invalid non-ASCII or control character in header: 0x00C4
I tried to just replace all the 'Ä' characters with 'A' and the problem is fixed. This is not a real fix though. I cannot encode the whole variable (see above) as it would result in double encoding for other encoded characters.
This problem only occurs with IE11 and Edge (AFAIK) and works fine with at least Chrome.
I'm not 100% sure where the actual problem is and why this is happening so does anyone have any ideas where to start looking and how to fix this without hacking with the string.replace?
EDIT
I could fix it with something like this, but I'm not seriously doing this. Seems way too hacky.
var problemPart = param.Substring(param.LastIndexOf('/') + 1, param.Length - param.LastIndexOf('/') - 1);
var fixedPart = WebUtility.UrlDecode(problemPart);
fixedPart = WebUtility.UrlEncode(fixedPart);
param = param.Replace(problemPart, fixedPart);
EDIT 2
I think the problem is that IE11 and Edge change the encoding by adding control characters to it when the URL ends up in the referer header. The fix I added to the original post doesn't actually fix the problem but just works around it. The control character that gets added to the URL is %C2%84 (so Ä becomes %C3%84%C2%84 instead of just %C3%84).
TEMPORARY WORKAROUND
I basically used the code above to work around the issue. I iterated over the parameter value and re-encoded all the invalid characters in it. This doesn't fix the root cause, but it works around the issue and the user doesn't get any errors on screen.
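The workaround described above (re-encode only the invalid characters, leave existing escapes alone) can be sketched as follows. This is a minimal illustration in Python rather than C#, not the poster's actual code; the function name reencode_unsafe is hypothetical, and it assumes that "invalid" means any non-ASCII or control character, as the exception message suggests.

```python
from urllib.parse import quote

def reencode_unsafe(value: str) -> str:
    """Percent-encode only non-ASCII and control characters.

    Printable ASCII, including '%', is copied as-is, so escapes that are
    already present (such as %2F) are not double-encoded.
    """
    out = []
    for ch in value:
        if 0x20 <= ord(ch) < 0x7F:
            out.append(ch)                  # printable ASCII stays untouched
        else:
            out.append(quote(ch, safe=""))  # UTF-8 bytes written as %XX
    return "".join(out)

# The partially decoded value from the question.
partially_decoded = "%2Fpath%2Fto%2FHÄÄ%2F20%2F19"
header_safe = reencode_unsafe(partially_decoded)
print(header_safe)  # %2Fpath%2Fto%2FH%C3%84%C3%84%2F20%2F19
```

Unlike decoding and re-encoding the whole value, this never touches a '%' that is already in the string, so it cannot double-encode.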

How to remove the "?" character from g-wan URIs

I have checked cache.c <- totally clueless what it is doing or how to have pretty permalinks to servlet calls.
Update: OK, I know what the above does, but the problem is you have to call the above script first before you can access it as permalink. Is there any way I can access permalinks without using "?" at all (in the first place)?
I have also checked on this link: Anatomy of G-WAN URI servlets
I would like to have http://example.com:8080/servlet/arg1/arg2, without "?", and would like the above link to reference "servlet" to servlet.c.
Basically, like this pretty URL for this question
https://stackoverflow.com/questions/27084626/how-to-remove-in-g-wan-url-completely
See...no "?" within the URL.
Is this possible?
I have also checked
u8 *query_char = (u8*)get_env(argv, QUERY_CHAR);
*query_char = '!'; // use "/!hello.c" instead of "/?hello.c"
I know I can't do
*query_char = '';
You can rewrite the URL with a handler; there is a simple rewrite example.
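The string mapping such a rewrite would need to perform can be sketched like this. It is only an illustration of the transformation, written in Python, and does not use the G-WAN handler API. The function name rewrite_pretty_uri is hypothetical, and the target form "/?servlet.c&arg1&arg2" is an assumption extrapolated from the "/?hello.c" form mentioned in the question.

```python
def rewrite_pretty_uri(path: str, query_char: str = "?") -> str:
    """Map '/servlet/arg1/arg2' to '/?servlet.c&arg1&arg2' (assumed target form)."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return path  # nothing to rewrite for '/' or an empty path
    servlet, args = parts[0], parts[1:]
    # First segment names the servlet source file; the rest become arguments.
    return "/" + query_char + "&".join([servlet + ".c"] + args)

rewritten = rewrite_pretty_uri("/servlet/arg1/arg2")
print(rewritten)  # /?servlet.c&arg1&arg2
```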

Nesting HTTP GET parameters (request within a request)

I want to call a JSP with GET parameters within the GET parameter of a parent JSP. The URL for this would be http://server/getMap.jsp?lat=30&lon=-90&name=http://server/getName.jsp?lat1=30&lon1=-90
getName.jsp will return a string that goes in the name parameter of getMap.jsp.
I think the problem here is that &lon1=-90 at the end of the URL will be given to getMap.jsp instead of getName.jsp. Is there a way to distinguish which GET parameter goes to which URL?
One idea I had was to encode the second URL (e.g. = -> %3D and & -> %26) but that didn't work out well. My best idea so far is to allow only one parameter in the second URL, comma-delimited. So I'll have http://server/getMap.jsp?lat=30&lon=-90&name=http://server/getName.jsp?params=30,-90 and leave it up to getName.jsp to parse its variables. This way I leave the & alone.
NOTE - I know I can approach this problem from a completely different angle and avoid nested URLs altogether, but I still wonder (for the sake of knowledge!) if this is possible or if anyone has done it...
This has been done a lot, especially with ad serving technologies and URL redirects.
An encoded URL should work just fine, but you need to encode it completely. A generator can be found here.
So this:
http://server/getMap.jsp?lat=30&lon=-90&name=http://server/getName.jsp?lat1=30&lon1=-90
becomes this: http://server/getMap.jsp?lat=30&lon=-90&name=http%3A%2F%2Fserver%2FgetName.jsp%3Flat1%3D30%26lon1%3D-90
I am sure that JSP has a function for this. Look for "urlencode". Your JSP will see the contents of the GET variable "name" as the decoded string: "http://server/getName.jsp?lat1=30&lon1=-90"
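The round trip described above can be checked with a short sketch. It uses Python's standard urllib.parse rather than JSP, purely to illustrate the encoding; the variable names are made up for the example.

```python
from urllib.parse import quote, urlsplit, parse_qs

inner = "http://server/getName.jsp?lat1=30&lon1=-90"

# safe="" makes quote() escape ':', '/', '?', '=' and '&' as well,
# so the inner URL cannot leak parameters into the outer query string.
outer = "http://server/getMap.jsp?lat=30&lon=-90&name=" + quote(inner, safe="")
print(outer)

# The receiving side splits on '&' first and decodes each value afterwards.
params = parse_qs(urlsplit(outer).query)
print(params["name"][0])  # http://server/getName.jsp?lat1=30&lon1=-90
```

The outer page sees exactly three parameters (lat, lon, name), and lon1 stays inside the value of name.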

Is IIS performing an illegal character substitution? If so, how to stop it?

Context: ASP.NET MVC running in IIS, with a UTF-8 %-encoded URL.
Using the standard project template, and a test-action in HomeController like:
public ActionResult Test(string id)
{
    return Content(id, "text/plain");
}
This works fine for most %-encoded UTF-8 routes, such as:
http://mydevserver/Home/Test/%e4%ba%ac%e9%83%bd%e5%bc%81
with the expected result 京都弁
However using the route:
http://mydevserver/Home/Test/%ee%93%bb
the url is not received correctly.
Aside: %ee%93%bb is %-encoded code-point 0xE4FB; basic-multilingual-plane, private-use area; but ultimately - a valid unicode code-point; you can verify this manually, or via:
string value = ((char) 0xE4FB).ToString();
string encoded = HttpUtility.UrlEncode(value); // %ee%93%bb
Now, what happens next depends on the web-server; on the Visual Studio Development Server (aka cassini), the correct id is received - a string of length one, containing code-point 0xE4FB.
If, however, I do this in IIS or IIS Express, I get a different id, specifically "î“»", code-points: 0xEE, 0x201C, 0xBB. You will immediately recognise the first and last as the start and end of our percent-encoded string... so what happened in the middle?
Well:
code-point 0x93 is “ (source)
code-point 0x201c is “ (source)
It looks to me very much like IIS has performed some kind of quote-translation when processing my url. Now maybe this might have uses in a few scenarios (I don't know), but it is certainly a bad thing when it happens in the middle of a %-encoded UTF-8 block.
Note that HttpContext.Current.Request.Raw also shows this translation has occurred, so this does not look like an MVC bug; note also Darin's comment, highlighting that it works differently in the path vs query portion of the url.
So (two-parter):
is my analysis missing some important subtlety of unicode / url processing?
how do I fix it? (i.e. make it so that I receive the expected character)
id = Encoding.UTF8.GetString(Encoding.Default.GetBytes(id));
This will give you your original id.
IIS uses Default (ANSI) encoding for path characters. Your url encoded string is decoded using that and that is why you're getting a weird thing back.
To get the original id you can convert it back to bytes and get the string using utf8 encoding.
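The effect and the fix can be reproduced outside IIS. The sketch below uses Python and assumes the server's default ANSI code page is Windows-1252 (cp1252); Encoding.Default depends on the machine's locale, so that is an assumption, not something the question states.

```python
from urllib.parse import unquote

escaped = "%ee%93%bb"

# Decoding the escaped bytes as UTF-8 gives the single intended character.
expected = unquote(escaped)                      # U+E4FB

# Decoding the same bytes as Windows-1252 gives three characters instead.
garbled = unquote(escaped, encoding="cp1252")    # U+00EE, U+201C, U+00BB

# The fix from the answer: turn the garbled string back into bytes using
# the ANSI code page, then decode those bytes as UTF-8.
recovered = garbled.encode("cp1252").decode("utf-8")

print([hex(ord(c)) for c in garbled])  # ['0xee', '0x201c', '0xbb']
print(recovered == expected)           # True
```

Byte 0x93 maps to U+201C in Windows-1252, which accounts for the "quote translation" seen in the middle of the string.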
See Unicode and ISAPI Filters
ISAPI Filter is an ANSI API - all values you can get/set using the API
must be ANSI. Yes, I know this is shocking; after all, it is 2006 and
everything nowadays are in Unicode... but remember that this API
originated more than a decade ago when barely anything was 32bit, much
less Unicode. Also, remember that the HTTP protocol which ISAPI
directly manipulates is in ANSI and not Unicode.
EDIT: Since you mentioned that it works with most other characters, I'm assuming that IIS has some sort of encoding detection mechanism which is failing in this case. As a workaround, though, you can prefix your id with this char and then easily detect whether the problem occurred (if this char is missing). Not an ideal solution, but it will work. You can then write a custom model binder and a wrapper class in ASP.NET MVC to make your consuming code cleaner.
Once Upon A Time, URLs themselves were not in UTF-8. They were in the ANSI code page. This facilitates the fact that they often are used to select, well, pathnames in the server's file system. In ancient times, IE had an option to tell whether you wanted to send UTF-8 URLs or not.
Perhaps buried in the bowels of the IIS config there is a place to specify the URL encoding, and perhaps not.
Ultimately, to get around this, I had to use request.ServerVariables["HTTP_URL"] and some manual parsing, with a bunch of error-handling fallbacks (additionally compensating for some related glitches in Uri). Not great, but only affects a tiny minority of awkward requests.

Ampersands in URLRewriter Query Strings

I have a query string parameter value that contains an ampersand. For example, a valid value for the parameter may be:
a & b
When I generate the URL that contains the parameter, I'm using System.Web.HttpUtility.UrlEncode() to make each element URL-friendly. It's (correctly) giving me a URL like:
http://example.com/foo?bar=a+%26+b
The problem is that ASP.NET's Request object is interpreting the (encoded) ampersand as a Query String parameter delimiter, and is thus splitting my value into 2 parts (the first has "bar" as the parameter name; the second has a null name).
It appears that ASP.NET is URL-decoding the URL first and then using that when parsing the query string.
What's the best way to work around this?
UPDATE: The problem hinges on URLRewriter (a third-party plugin) and not ASP.NET itself. I've changed the title to reflect this, but I'll leave the rest of the question text as-is until I find out more about the problem.
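The symptom described here is what you would get if the query string were decoded before being split on "&". The sketch below contrasts the two orders in Python; it is an illustration of the suspected behavior, not the actual ASP.NET or URLRewriter code, and both function names are hypothetical.

```python
from urllib.parse import unquote_plus

def split_then_decode(query):
    """Correct order: split on '&' first, then decode each name and value."""
    pairs = []
    for field in query.split("&"):
        name, sep, value = field.partition("=")
        if not sep:
            name, value = None, field   # a field without '=' has no name
        else:
            name = unquote_plus(name)
        pairs.append((name, unquote_plus(value)))
    return pairs

def decode_then_split(query):
    """Suspected buggy order: decode everything, then split on '&'."""
    pairs = []
    for field in unquote_plus(query).split("&"):
        name, sep, value = field.partition("=")
        if not sep:
            name, value = None, field
        pairs.append((name, value))
    return pairs

query = "bar=a+%26+b"
print(split_then_decode(query))  # [('bar', 'a & b')]
print(decode_then_split(query))  # [('bar', 'a '), (None, ' b')]
```

The second result matches the report: one value named "bar" and a second value with a null name.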
Man, I am in the same boat. I have spent hours and hours trying to figure out what the problem is, and as you said, it is a bug in both, since normal links that contain weird characters or UTF-8 encoded characters are parsed fine by ASP.NET.
I think we have to switch to MVC routing.
Update: you won't believe it, I have found the problem and it is so strange: it is with IIS.
Try to launch your page from the Visual Studio dev server and Unicode characters will be parsed just fine, but if you launch the page from IIS 7 it will give you the ???? characters.
Hope somebody will shed some light here.
I would have thought that %26 and '&' mean exactly the same thing to the web server, so it's the expected behavior. Urlencode is for encoding URLs, not encoding query strings.
... hang on ...
Try searching for abc&def in google, you'll get:
http://www.google.com.au/search?q=abc%26def
So your query string is correct, %26 is a literal ampersand. Hmm you're right, sounds like a bug. How do you go with an & instead of the %26 ?
Interesting reading:
http://www.stylusstudio.com/xsllist/200104/post11060.html
Switching to UrlRewritingNet.UrlRewrite did not help, as it apparently has the same bug. I'm thinking it might have something to do with ASP.NET after all.
I think URLRewriter has a problem with nameless parameters (null name).
I had a similar problem. When I gave my nameless parameter a (dummy) name, everything worked as expected.

Resources