How to avoid Java request.getQueryString() get escaped string [duplicate] - servlets

I have a Servlet that processes a query string from a JSP form.
When the JSP sends the information to the Servlet, it puts the query in the URL, like this:
https://www.mywebsite.com/search?query=Barcelona
And I receive the data inside the Servlet this way:
String query = request.getQueryString();
// query will receive "query=Barcelona" (getQueryString() returns the whole raw query string)
Everything works fine. The problem comes when the string contains non-ASCII characters, such as Russian, Chinese, or Arabic text. So, if I use a URL like:
https://www.mywebsite.com/search?query=Медеуский
I will receive in the query variable the following text:
String query = request.getQueryString();
// query will receive "query=%D0%9C%D0%B5%D0%B4%D0%B5%D1%83%D1%81%D0%BA%D0%B8%D0%B9"
So, how can I avoid getting the escaped form?
Thanks!

You can use URLDecoder.decode(request.getQueryString(), "UTF-8")
or, if the text was decoded with the wrong charset (note that these only re-interpret bytes and do not undo %XX escapes), one of these alternatives:
String unicodeQuery = new String(query.getBytes(), StandardCharsets.UTF_8);
String unicodeQuery = new String(query.getBytes("ISO_8859_1"), StandardCharsets.UTF_8);
String unicodeQuery = new String(query.getBytes(Charset.defaultCharset()), StandardCharsets.UTF_8);
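A minimal sketch of the URLDecoder suggestion, assuming the browser sent the query as UTF-8. The class and method names are illustrative and not part of the Servlet API. The raw query string is split on & and = first and each piece is decoded afterwards, because a decoded value may itself contain & or =. In a real servlet, request.getParameter("query") normally returns the decoded value directly, provided the container is configured to decode query strings as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical name), not part of the Servlet API.
final class QueryStringDecoding {

    // Splits a raw query string, as returned by request.getQueryString(),
    // into name/value pairs and percent-decodes each piece as UTF-8.
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            params.put(decode(name), decode(value));
        }
        return params;
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "query=%D0%9C%D0%B5%D0%B4%D0%B5%D1%83%D1%81%D0%BA%D0%B8%D0%B9";
        // Prints the decoded Cyrillic text if the console supports it.
        System.out.println(parse(raw).get("query"));
    }
}
```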

Related

Accessing the query string value using ASP.NET

I have been trying to find the answer to my question but was unable to, so finally I'm here. What I want to do is access the value passed to a web page (GET or POST request) using ASP.NET. To be clearer, for example:
URL: http://www.foobar.com/SaleVoucher.aspx?sr=34
Using ASP.NET I want to get the sr value, i.e. 34.
I come from a C# background and am new to ASP.NET, so I don't know much about it.
Thanks.
You can refer to this: QueryString
It explains how to access the query string using:
Request.Url.Query
That is not called a header, but the query string.
The document.location.search object will contain it, and the JavaScript to get any query string value by key would be something like:
function getParameterByName(name) {
    name = name.replace(/[\[]/, "\\\[").replace(/[\]]/, "\\\]");
    var regex = new RegExp("[\\?&]" + name + "=([^&#]*)"),
        results = regex.exec(location.search);
    return results == null ? "" : decodeURIComponent(results[1].replace(/\+/g, " "));
}
code from other question: https://stackoverflow.com/a/901144/28004

asp .net query string encoding and decoding

I type the following url into my web browser and press enter.
http://localhost/website.aspx?paymentID=6++7d6CZRKY%3D&language=English
Now in my code, when I do HttpContext.Current.Request.QueryString["paymentID"],
I get 6  7d6CZRKY= (each + has become a space)
but when I do HttpContext.Current.Request.QueryString.ToString() I see the following:
paymentID=6++7d6CZRKY%3D&language=English
What I want is to extract the actual payment ID that the user typed in the web browser URL. I am not worried about whether the URL is encoded or not, because I know there is something odd going on here, with %3D and the + sign appearing at the same time. But I do need the actual + sign. Somehow it gets decoded to a space when I do HttpContext.Current.Request.QueryString["paymentID"].
I just want to extract the actual payment ID that the user typed. What's the best way to do it?
Thank you.
You'll need to encode the value first, using Server.UrlEncode(). A + in a URL query string means a space, so a literal + needs to be encoded as %2b.
string paymentId = Server.UrlEncode("6++7d6CZRKY=");
// paymentId = 6%2b%2b7d6CZRKY%3d
And now
string result = Request.QueryString["paymentId"].ToString();
//result = 6++7d6CZRKY=
However, if the value contains spaces instead:
string paymentId = Server.UrlEncode("6  7d6CZRKY=");
// paymentId looks like what you want, but each + stands for a space -- 6++7d6CZRKY%3d
string result = Request.QueryString["paymentId"].ToString();
// result = 6  7d6CZRKY=
There is some info on this here: Plus sign in query string.
But I suppose you could also use a regular expression to get your parameter out of the query string. Something like this:
string queryString = HttpContext.Current.Request.QueryString.ToString();
string paramPaymentID = Regex.Match(queryString, "paymentID=([^&]+)").Groups[1].Value;
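The same rules apply outside ASP.NET. As a rough illustration in Java (used here only as a neutral sketch; the class and helper names are made up), form-style decoding turns each + into a space, encoding a literal + yields %2B, and the regular-expression approach reads the value exactly as it appeared in the URL:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not tied to any web framework.
final class PlusSignDemo {

    // Returns the still-encoded value of a parameter, or null if it is absent.
    static String rawParam(String rawQuery, String name) {
        Pattern p = Pattern.compile("(?:^|&)" + Pattern.quote(name) + "=([^&]*)");
        Matcher m = p.matcher(rawQuery);
        return m.find() ? m.group(1) : null;
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 always exists
        }
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 always exists
        }
    }

    public static void main(String[] args) {
        String rawQuery = "paymentID=6++7d6CZRKY%3D&language=English";
        String raw = rawParam(rawQuery, "paymentID");
        System.out.println(raw);                            // 6++7d6CZRKY%3D
        System.out.println("[" + decode(raw) + "]");        // [6  7d6CZRKY=]
        System.out.println(encode("6++7d6CZRKY="));         // 6%2B%2B7d6CZRKY%3D
        System.out.println(decode(encode("6++7d6CZRKY="))); // 6++7d6CZRKY=
    }
}
```

The last line shows why the sender has to encode the value: only a + that was sent as %2B survives decoding on the receiving side.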
I sent Arabic text in my query string,
and when I received the string it was encoded.
After calling Server.UrlDecode:
departmentName = Server.UrlDecode(departmentName);
it was Arabic again.
I hope this helps you.

Input string not in Correct format error while using the Google currency Conversion API

My application is in ASP.NET MVC 3, coded in C#.
I'm using the Google Currency Conversion API to get the conversion.
Google Currency Conversion API
I'm passing the From Currency and To Currency values and the amount to my controller.
Below is my C# code.
WebClient web = new WebClient();
string url = string.Format("http://www.google.com/ig/calculator?hl=en&q={2}{0}%3D%3F{1}", CurrenctCurr.ToUpper(), DestinationCurr.ToUpper(), TotCurrentPay);
string response = web.DownloadString(url);
Regex regex = new Regex("rhs: \\\"(\\d*.\\d*)");
Match match = regex.Match(response);
decimal rate = Convert.ToDecimal(match.Groups[1].Value);
I'm getting an error on the line
decimal rate = Convert.ToDecimal(match.Groups[1].Value.ToString());
The error is "Input string was not in a correct format." There is no inner exception.
I tried calling ToString() on the value:
decimal rate = Convert.ToDecimal(match.Groups[1].Value.ToString());
I also tried putting the match value in a string variable and applying Substring logic to it, but that is not working either:
string test = match.ToString();
test = test.Substring(0, test.IndexOf("\""));
There is a single quotation mark (") in the string, so I'm trying to take the part of the string up to the ".
Please suggest how I can solve this issue, or what else I can try.
Try this for building the URL:
Uri uri = new Uri(string.Format("http://www.google.com/ig/calculator?hl=en&q={2}{0}%3D%3F{1}", CurrenctCurr.ToUpper(), DestinationCurr.ToUpper(), TotCurrentPay));
string url = uri.AbsoluteUri + uri.Fragment;
string response = web.DownloadString(url);

Character + is converted to %2B in HTTP Post

I'm adding functionality to a GM script we use here at work, but when trying to post (cross site may I add) to another page, my posting value of CMD is different than what it is on the page.
It's supposed to be Access+My+Account+Info but the value that is posted becomes Access%2BMy%2BAccount%2BInfo.
So I guess my question is: What's escaping my value and how do I make it not escape? And if there's no way to unescape it, does anyone have any ideas of a workaround?
Thanks!
%2B is the code for a +. You (or whatever framework you're using) should already be decoding the POST data server-side...
Just a quick remark: If you want to decode a path segment, you can use UriUtils (spring framework):
@Test
public void decodeUriPathSegment() {
    String pathSegment = "some_text%2B"; // encoded path segment
    String decodedText = UriUtils.decode(pathSegment, "UTF-8");
    System.out.println(decodedText);
    assertEquals("some_text+", decodedText);
}
Uri path segments are different from HTML escape chars (see list). Here is an example:
@Test
public void decodeHTMLEscape() {
    String someString = "some_text&#43;"; // HTML numeric entity for '+'
    String stringJsoup = org.jsoup.parser.Parser.unescapeEntities(someString, false);
    String stringApacheCommons = StringEscapeUtils.unescapeHtml4(someString);
    String stringSpring = htmlUnescape(someString); // static import of Spring's HtmlUtils.htmlUnescape
    assertEquals("some_text+", stringJsoup);
    assertEquals("some_text+", stringApacheCommons);
    assertEquals("some_text+", stringSpring);
}
/data/v50.0/query?q=SELECT Id from Case
This worked for me. Use a space instead of '+'.

ISO-8859-1 to UTF8 in ASP.NET 2

We've got a page which posts data to our ASP.NET app in ISO-8859-1
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<title>Sample Search Invoker</title>
</head>
<body>
<form name="advancedform" method="post" action="SearchResults.aspx">
<input class="field" name="SearchTextBox" type="text" />
<input class="button" name="search" type="submit" value="Search >" />
</form>
and in the code behind (SearchResults.aspx.cs)
System.Collections.Specialized.NameValueCollection postedValues = Request.Form;
String nextKey;
for (int i = 0; i < postedValues.AllKeys.Length; i++)
{
    nextKey = postedValues.AllKeys[i];
    if (nextKey.Substring(0, 2) != "__")
    {
        // Get basic search text
        if (nextKey.EndsWith(XAEConstants.CONTROL_SearchTextBox))
        {
            // Get search text value
            String sSentSearchText = postedValues[i];
            System.Text.Encoding iso88591 = System.Text.Encoding.GetEncoding("iso-8859-1");
            System.Text.Encoding utf8 = System.Text.Encoding.UTF8;
            byte[] abInput = iso88591.GetBytes(sSentSearchText);
            sSentSearchText = utf8.GetString(System.Text.Encoding.Convert(iso88591, utf8, abInput));
            this.SearchText = sSentSearchText.Replace('<', ' ').Replace('>', ' ');
            this.PreviousSearchText.Value = this.SearchText;
        }
    }
}
When we pass through Merkblätter, it gets pulled out of postedValues[i] as Merkbl�tter
The raw string is Merkbl%ufffdtter
Any ideas?
You have this line of code:-
String sSentSearchText = postedValues[i];
The decoding of the octets in the post has already happened here.
The problem is that META http-equiv doesn't tell the server about the encoding.
You could just add RequestEncoding="ISO-8859-1" to the @Page directive and stop trying to fiddle around with the decoding yourself (since it has already happened).
That doesn't help either. It seems you can only specify the Request encoding in the web.config.
Better would be to stop using ISO-8859-1 altogether and leave it with the default UTF-8 encoding. I can see no gain and only pain with using a restrictive encoding.
Edit
If it seems that changing the posting forms encoding is not a possibility then we seem to be left with no alternative than to handle the decoding ourselves. To that end include these two static methods in your receiving code-behind:-
private static NameValueCollection GetEncodedForm(System.IO.Stream stream, Encoding encoding)
{
    System.IO.StreamReader reader = new System.IO.StreamReader(stream, Encoding.ASCII);
    return GetEncodedForm(reader.ReadToEnd(), encoding);
}
private static NameValueCollection GetEncodedForm(string urlEncoded, Encoding encoding)
{
    NameValueCollection form = new NameValueCollection();
    string[] pairs = urlEncoded.Split("&".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
    foreach (string pair in pairs)
    {
        string[] pairItems = pair.Split("=".ToCharArray(), 2, StringSplitOptions.RemoveEmptyEntries);
        string name = HttpUtility.UrlDecode(pairItems[0], encoding);
        string value = (pairItems.Length > 1) ? HttpUtility.UrlDecode(pairItems[1], encoding) : null;
        form.Add(name, value);
    }
    return form;
}
Now instead of assigning:-
postedValues = Request.Form;
use:-
postedValues = GetEncodedForm(Request.InputStream, Encoding.GetEncoding("ISO-8859-1"));
You can now remove the encoding malarkey from the rest of the code.
I think adding your encoding to web.config like this will probably solve your problem:
<configuration>
    <system.web>
        <globalization
            fileEncoding="iso-8859-1"
            requestEncoding="iso-8859-1"
            responseEncoding="iso-8859-1"
            culture="en-US"
            uiCulture="en-US"
        />
    </system.web>
</configuration>
We had the same problem that you have. The topic is not straight-forward at all.
The first tip is to set the Response encoding of the page that posts the data (usually the same page as the one that receives the data in .NET) to the desired form post encoding.
However, this is just a hint to the user's browser on how to interpret the characters sent from the server. The user might choose to override the encoding manually. And, if the user overrides the encoding of the page, the encoding of the data sent in the form is also changed (to whatever the user has set the encoding to).
There is a small trick, though. If you add a hidden field with the name _charset_ (notice the underscores) in your form, most browsers will fill out this form field with the name of the charset used when posting the form. This form field is also a part of the HTML5 specification.
So, you might think you're good to go. However, by the time your page code runs, ASP.NET has already URL-decoded all parameters sent in the form. So when you finally have the value of the _charset_ field, the value of the field containing Merkblätter has already been decoded incorrectly by .NET.
You have two options:
In the ASP.NET page in question, perform the parsing of the request string manually
In Application_BeginRequest, in Global.asax, parse the request parameters manually, extracting the _charset_ field. When you get the value, set Request.ContentEncoding to System.Text.Encoding.GetEncoding(<value of _charset_ field>). If you do this, you can read the value of the field containing Merkblätter as usual, no matter what charset the client sends the value in.
In either of the cases above, you need to manually read Request.InputStream, to fetch the form data. I would recommend setting the Response Encoding to UTF-8 to have the greatest number of options in which characters you accept, and then treating the special cases when the user has overridden the charset especially, as specified above.
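The manual parsing described above can be sketched as follows. This is written in Java purely as an illustration of the two-pass idea, not as ASP.NET code. The class name, method names, and the fallback parameter are assumptions made for the example, and it assumes the _charset_ value is a plain charset name with no escaped characters. The first pass looks only for the raw _charset_ field, and the second pass decodes every field with the charset found there.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the _charset_ technique.
final class CharsetAwareFormParser {

    // Pass 1: find the raw (still percent-encoded) value of one field.
    static String rawValue(String body, String field) {
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            if (eq >= 0 && pair.substring(0, eq).equals(field)) {
                return pair.substring(eq + 1);
            }
        }
        return null;
    }

    // Pass 2: decode every field using the charset the browser declared,
    // or the fallback when no _charset_ field was posted.
    static Map<String, String> parse(String body, String fallbackCharset) {
        String declared = rawValue(body, "_charset_");
        String charset = (declared == null || declared.isEmpty()) ? fallbackCharset : declared;
        Map<String, String> form = new LinkedHashMap<>();
        try {
            for (String pair : body.split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                String name = eq >= 0 ? pair.substring(0, eq) : pair;
                String value = eq >= 0 ? pair.substring(eq + 1) : "";
                form.put(URLDecoder.decode(name, charset), URLDecoder.decode(value, charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
        return form;
    }

    public static void main(String[] args) {
        String body = "SearchTextBox=Merkbl%E4tter&_charset_=ISO-8859-1";
        System.out.println(parse(body, "UTF-8").get("SearchTextBox"));
    }
}
```

Because the charset name comes from the client, a production version would likely want to restrict it to a known list before using it.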
Function urlDecode(input)
    inp = Replace(input,"/","%2F")
    set conn = Server.CreateObject("MSXML2.ServerXMLHTTP")
    conn.setOption(2) = SXH_SERVER_CERT_IGNORE_ALL_SERVER_ERRORS
    conn.open "GET", "http://www.neoturk.net/urldecode.asp?url=" & inp, False
    conn.send ""
    urlDecode = conn.ResponseText
End Function
To speed this up, create a table in your database for decoded and encoded URLs and read it in the Application_OnStart section of global.asa. Then store the entries in the Application object.
Next, add a check against that Application object to the function above: if the decoded URL does not exist in the application array, request it once from the remote page (tip: urldecode.asp should be on a different server, see: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q316451), insert it into your database, and append it to the application array; otherwise, return the value from the Application object.
This is the best method I have found.
If anybody wants further details on application object, database operations etc. contact me via admin#neoturk.net
You can see above method successfully working at: lastiktestleri.com/Home
I also used, HeliconTech's ISAPI_Rewrite Lite version
usage is simple: url = Request.ServerVariables("HTTP_X_REWRITE_URL")
this will return the exact url directed to /404.asp
That's because you are encoding the string as ISO-8859-1 and decoding it as if it was a string encoded as UTF-8. This will surely mess up the data.
The form isn't posting the data as ISO-8859-1 just because you send the page using that encoding. You haven't specified any encoding for the form data, so the browser will choose an encoding that is capable of handling the data in the form. It may choose ISO-8859-1, but it may just as well choose some other encoding.
The data is sent to the server, where it is decoded and put in the Request.Form collection, according to the encoding that the browser specifies.
All you have to do is to read the string that has already been decoded from the Request.Form collection. You don't have to loop through all the items in the collection either, as you already know the name of the text box.
Just do:
string sentSearchText = Request.Form["SearchTextBox"];
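The general effect this answer describes, encoding text with one charset and decoding the bytes with another, can be reproduced in a few lines. The sketch below uses Java as a neutral illustration (the class and method names are made up). In ISO-8859-1 the letter ä is the single byte 0xE4, which is not valid UTF-8 when followed by a plain ASCII letter, so the UTF-8 decoder substitutes the replacement character U+FFFD. That is the same character that shows up as %ufffd in the question.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of a charset mismatch.
final class MojibakeDemo {

    // Encodes the text as ISO-8859-1, then wrongly decodes the bytes as UTF-8.
    static String reinterpret(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Merkbl\u00e4tter"; // Merkblätter
        String broken = reinterpret(original);
        // The ä has been replaced by U+FFFD; the rest of the word survives.
        System.out.println(broken.equals("Merkbl\uFFFDtter")); // true
    }
}
```

Once the replacement character has been substituted, the original byte is gone, which is why the fix has to happen where the bytes are first decoded rather than afterwards.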
What I ended up doing was forcing our app to be in ISO-8859-1. Unfortunately the underlying data may contain characters which don't fit nicely into that code page, so we go through the data before displaying it and convert everything above character code 127 into an entity. Not ideal, but it works for us...
I had the same problem, solved like this:
System.Text.Encoding iso_8859_2 = System.Text.Encoding.GetEncoding("ISO-8859-2");
System.Text.Encoding utf_8 = System.Text.Encoding.UTF8;
NameValueCollection n = HttpUtility.ParseQueryString("RT=A+v%E1s%E1rl%F3+nem+enged%E9lyezte+a+tranzakci%F3t", iso_8859_2);
Response.Write(n["RT"]);
A+v%E1s%E1rl%F3+nem+enged%E9lyezte+a+tranzakci%F3t will return "A vásárló nem engedélyezte a tranzakciót" as expected.
