Newtonsoft.JSON error with text with HTML encoded character - asp.net

I allow user input from TinyMCE on client and store it as a JSON string, then pass it to server ASP.NET C#.
The JSON String looks like this: { "mcfn2" : ";lt;p;gt;Trước đ& oacute;, việc tung ra t& ecirc;n miền lần đầu ti& ecirc;n được sự đồng & yacute; của ICANN - tổ chức quản l& yacute; t& ecirc;n miền quốc tế" } (JSON string contains Vietnamese accent)
But when process on server, I received error "Unterminated string. Expected delimiter: ". Line 1, position ...." (It looks like the error happened because of đ& oacute;). (In this page, I seperate & with character after it by a space, because it will automacally converted to a Vietnamese if there are no space)
There are no error if user input is English text (no Vietnamese accent).
Please guide me how to fix this error.

I know at this time this will probably not be useful for you, but maybe it can help another person.
You should convert your string to UTF8 to deal with accents (Vietnamese and many other languages) before serializing it to JSON. For that you can use this function:
private string ConvertToUtf8(string textOriginal)
{
if (!string.IsNullOrEmpty(textOriginal))
{
byte[] bytes = Encoding.Default.GetBytes(textOriginal);
return Encoding.UTF8.GetString(bytes);
}
return string.Empty;
}

Related

CSV file (with special characters) upload encoding issue

I am trying to upload a CSV file that has special characters using ServletFileUpload of apache common. But the special characters present in the CSV are being stored as junk characters in the database. The special characters I have are Trademark, registered etc. Following is the code snippet.
ServletFileUpload upload = new ServletFileUpload();
FileItemIterator iter = upload.getItemIterator(request);
while (iter.hasNext()) {
FileItemStream item = iter.next();
String name = item.getFieldName();
InputStream stream = item.openStream();
if (item.isFormField()) {
System.out.println("Form field " + name + " with value "
+ Streams.asString(stream, "UTF-8") + " detected.");
}
}
I have tried reading it using BufferendReader, used request.setCharacterEncoding("UTF-8"), tried upload.setHeaderEncoding("UTF-8") and also checked with IOUtils.copy() method, but none of them worked.
Please advice how to get rid of this issue and where it needs to be addressed? Is there anything I need to do beyond servlet code?
Thanks
What database are using? What character set is database using? Characters can be malformed in the database rather than in Java code.

Qt, QUrl, QUrlQuery: Encoding special character in a query string

I create a URL query like this:
QString normalize(QString text)
{
text.replace("%", "%25");
text.replace("#", "%40");
text.replace("‘", "%27");
text.replace("&", "%26");
text.replace("“", "%22");
text.replace("’", "%27");
text.replace(",", "%2C");
text.replace(" ", "%20");
return text;
}
QString key = "usermail";
QString value = "aemail#gmail.com";
QUrlQuery qurlqr;
qurlqr.addQueryItem(normalize(key), normalize(value));
QString result = qurlqr.toString();
The result that's be expecting is :
usermail=aemail%40gmail.com.
But I received:
usermail=aemail#gmail.com
I don't know why. Can you help me?
(I'm using Qt5 on Win7)
QUrlQuery's toString by default decodes the percent encoding. If you want the encoded version try:
qurlqr.toString(QUrl::FullyEncoded)
Also you don't need to manually encode the string by replacing characters; you could instead use QUrl::toEncoded() (I suggest you read the QUrlQuery documentation).

escaping string for json result in asp.net server side operation

I have a server side operation manually generating some json response. Within the json is a property that contains a string value.
What is the easiest way to escape the string value contained within this json result?
So this
string result = "{ \"propName\" : '" + (" *** \\\"Hello World!\\\" ***") + "' }";
would turn into
string result = "{ \"propName\" : '" + SomeJsonConverter.EscapeString(" *** \\\"Hello World!\\\" ***") + "' }";
and result in the following json
{ \"propName\" : '*** \"Hello World!\" ***' }
First of all I find the idea to implement serialization manually not good. You should to do this mostla only for studying purpose or of you have other very important reason why you can not use standard .NET classes (for example use have to use .NET 1.0-3.0 and not higher).
Now back to your code. The results which you produce currently are not in JSON format. You should place the property name and property value in double quotas:
{ "propName" : "*** \"Hello World!\" ***" }
How you can read on http://www.json.org/ the double quota in not only character which must be escaped. The backslash character also must be escaped. You cen verify you JSON results on http://www.jsonlint.com/.
If you implement deserialization also manually you should know that there are more characters which can be escaped abbitionally to \" and \\: \/, \b, \f, \n, \r, \t and \u which follows to 4 hexadecimal digits.
How I wrote at the beginning of my answer, it is better to use standard .NET classes like DataContractJsonSerializer or JavaScriptSerializer. If you have to use .NET 2.0 and not higher you can use Json.NET.
You may try something like:
string.replace(/(\\|")/g, "\\$1").replace("\n", "\\n").replace("\r", "\\r");

How to correctly uppercase Greek words in .NET?

We have ASP.NET application which runs different clients around the world. In this application we have dictionary for each language. In dictionary we have words in lowercase and sometimes we uppercase it in code for typographic reasons.
var greek= new CultureInfo("el-GR");
string grrr = "Πόλη";
string GRRR = grrr.ToUpper(greek); // "ΠΌΛΗ"
The problem is:
...if you're using capital letters
then they must appear like this: f.e.
ΠΟΛΗ and not like ΠΌΛΗ, same for all
other words written in capital letters
So is it possible generically to uppercase Greek words correctly in .NET? Or should I wrote my own custom algorithm for Greek uppercase?
How do they solve this problem in Greece?
I suspect that you're going to have to write your own method, if el-GR doesn't do what you want. Don't think you need to go to the full length of creating a custom CultureInfo, if this is all you need. Which is good, because that looks quite fiddly.
What I do suggest you do is read this Michael Kaplan blog post and anything else relevant you can find by him - he's been working on and writing about i18n and language issues for years and years and his commentary is my first point of call for any such issues on Windows.
I don't know much about ASP.Net but I know how I'd do this in Java.
If the characters are Unicode, I would just post-process the output from ToUpper with some simple substitutions, one being the conversion of \u038C (Ό) to \u039F (Ο) or \u0386 (Ά) to \u0391 (Α).
From the looks of the Greek/Coptic code page (\u0370 through \u03ff), there's only a few characters (6 or 7) you'll need to change.
Check out How do I remove diacritics (accents) from a string in .NET?
How about replacing the wrong characters with the right ones:
/// <summary>
/// Returns the string to uppercase using Greek uppercase rules.
/// </summary>
/// <param name="source">The string that will be converted to uppercase</param>
public static string ToUpperGreek(this string source)
{
Dictionary<char, char> mappings = new Dictionary<char, char>(){
{'Ά','Α'}, {'Έ','Ε'}, {'Ή','Η'}, {'Ί','Ι'}, {'Ό','Ο'}, {'Ύ','Υ'}, {'Ώ','Ω'}
};
source = source.ToUpper();
char[] result = new char[source.Length];
for (int i = 0; i < result.Length; i++)
{
result[i] = mappings.ContainsKey(source[i]) ? mappings[source[i]] : source[i];
}
return new string(result);
}

How do you remove the padding from Modifed Base 64 for URL?

This Wiki article on Base64 URL says
"For this reason, a modified Base64 for URL variant exists, where no padding '=' will be used, and the '+' and '/' characters of standard Base64 are respectively replaced by '-' and '_', so that using URL encoders/decoders is no longer necessary and has no impact on the length of the encoded value, leaving the same encoded form intact for use in relational databases, web forms, and object identifiers in general."
When I try and remove the padding using ASP.NET, I get an error when I get my query strings back. How can I account for the missing padding?
string encoded = GetBase64FromQueryString();
encoded = encoded.PadRight(NextMultiple(encoded.Length, 4), '=');
...
static int NextMultiple(int value, int multiple)
{
int r = value % multiple;
return value + (r != 0 ? multiple - r : 0);
}

Resources