I have a server side operation manually generating some json response. Within the json is a property that contains a string value.
What is the easiest way to escape the string value contained within this json result?
So this
string result = "{ \"propName\" : '" + (" *** \\\"Hello World!\\\" ***") + "' }";
would turn into
string result = "{ \"propName\" : '" + SomeJsonConverter.EscapeString(" *** \\\"Hello World!\\\" ***") + "' }";
and result in the following json
{ \"propName\" : '*** \"Hello World!\" ***' }
First of all I find the idea to implement serialization manually not good. You should to do this mostla only for studying purpose or of you have other very important reason why you can not use standard .NET classes (for example use have to use .NET 1.0-3.0 and not higher).
Now back to your code. The results which you produce currently are not in JSON format. You should place the property name and property value in double quotas:
{ "propName" : "*** \"Hello World!\" ***" }
How you can read on http://www.json.org/ the double quota in not only character which must be escaped. The backslash character also must be escaped. You cen verify you JSON results on http://www.jsonlint.com/.
If you implement deserialization also manually you should know that there are more characters which can be escaped abbitionally to \" and \\: \/, \b, \f, \n, \r, \t and \u which follows to 4 hexadecimal digits.
How I wrote at the beginning of my answer, it is better to use standard .NET classes like DataContractJsonSerializer or JavaScriptSerializer. If you have to use .NET 2.0 and not higher you can use Json.NET.
You may try something like:
string.replace(/(\\|")/g, "\\$1").replace("\n", "\\n").replace("\r", "\\r");
Related
I have defined grammar rules like
TOKEN : { < SINGLE_QUOTE : " ' " > }
TOKEN : { < STRING_LITERAL : " ' " (~["\n","\r"])* " ' ">
But I am not able to parse sequences like 're'd' .I need the parser to parse re'd as a string literal.But the parser parses 're' seperately and 'd' seperately for these rules.
If you need to lex re'd as STRING_LITERAL token then use the following rule
TOKEN : { < SINGLE_QUOTE : "'" > }
TOKEN : { < STRING_LITERAL : "'"? (~["\n","\r"])* "'"?>
I didn't see the rule for matching "re" separately.
In javacc, definition of your lexical specification STRING_LITERAL is to start with "'" single quot. But your input doesn't have the "'" at starting.
The "?" added in the STRING_LITERAL makes the single quot optional and if present only one. so this will match your input and lex as STRING_LITERAL.
JavaCC decision making rules:
1.) JavaCC will looks for the longest match.
Here in this case even if the input starts with the "'" the possible matches are SINGLE_QUOTE and STRING_LITERAL. the second input character tells which token to choose STRING_LITERAL.
2.) JavaCC takes the the rule declared first in the grammar.
Here if the input is only "'" then it will be lexed as SINGLE_QUOTE even if there is the possible two matches SINGLE_QUOTE and STRING_LITERAL.
Hope this will help you...
The following should work:
TOKEN : { < SINGLE_QUOTE : "'" > }
TOKEN : { < STRING_LITERAL : "'" (~["\n","\r"])* "'"> }
This is pretty much what you had, except that I removed some spaces.
Now if there are two on more apostrophes on a line (i.e. without an intervening newline or return) then the first and the last of those apostrophes together with all characters between should be lexed as one STRING_LITERAL token. That includes all intervening apostrophes. This is assuming there are no other rules involving apostrophes. For example, if your file is 're'd' that should lex as one token; likewise 'abc' + 'def' should lex as one token.
I've a json file with plone objects and there is one field of the objects giving me an error:
UnicodeDecodeError('ascii', '{"id":"aluminio-prata", "nome":"ALUM\xc3\x8dNIO PRATA", "num_demaos":0, "rendimento": 0.0, "unidade":"litros", "url":"", "particular":[], "profissional":[], "unidades":[]},', 36, 37, 'ordinal not in range(128)') (Also, the following error occurred while attempting to render the standard error message, please see the event log for full details: 'NoneType' object has no attribute 'getMethodAliases')
I already know witch field is, is the "title" from title = obj.pretty_title_or_id(), when I remove it from here its ok:
json += '{"id":"' + str(id) + '", "nome":"' + title + '", "num_demaos":' + str(num_demaos) + ', "rendimento": ' + str(rendimento) + ', "unidade":"' + str(unidade) + '", "url":"' + link_produto + '", "particular":' + arr_area_particular + ', "profissional":' + arr_area_profissional + ', "unidades":' + json_qtd + '},
but when I leave it I've got this error.
UnicodeDecodeError('ascii', '{"id":"aluminio-prata", "nome":"ALUM\xc3\x8dNIO PRATA", "num_demaos":0, "rendimento": 0.0, "unidade":"litros", "url":"", "particular":[], "profissional":[], "unidades":[]},', 36, 37, 'ordinal not in range(128)') (Also, the following error occurred while attempting to render the standard error message, please see the event log for full details: 'NoneType' object has no attribute 'getMethodAliases')
I'm going to assume that the error occurs when you're reading the JSON file.
Internally, Plone uses Python Unicode strings for nearly everything. If you read a string from a file, it will need to be decoded into Unicode before Plone can use it. If you give no instructions otherwise, Python will assume that the string was encoded as ASCII, and will attempt its Unicode conversion on that basis. It would be similar to writing:
unicode("ALUM\xc3\x8dNIO PRATA")
which will produce the same kind of error.
In fact, the string you're using was evidently encoded with the UTF-8 character set. That's evident from the "\xc3", and it also makes sense, because that's the character set Plone uses when it sends data to the outside world.
So, how do you fix this? You have to specify the character set that you wish to use when you convert to Unicode:
"ALUM\xc3\x8dNIO PRATA".decode('UTF8')
This gives you a Python Unicode string with no error.
So, after you've read your JSON file into a string (let's call it mystring), you will need to explicitly decode it by using mystring.decode('UTF8'). unicode(mystring, 'UTF8') is another form of the same operation.
As Steve already wrote do title.decode('utf8')
An Example illustrate the facts:
>>> u"Ä" == u"\xc4"
True # the native unicode char and escaped versions are the same
>>> "Ä" == u"\xc4"
False # the native unicode char is '\xc3\x84' in latin1
>>> "Ä".decode('utf8') == u"\xc4"
True # one can decode the string to get unicode
>>> "Ä" == "\xc4"
False # the native character and the escaped string are
# of course not equal ('\xc3\x84' != '\xc4').
I find this Thread very helpfull for Problems and Understanding with Encode/Decode of UTF-8.
I am using
string strurl = "Reports/ReportFilter.aspx";
and bind a tag as
AnchorLeftMenuLinks.Append(" href='javascript:OpenDialogue(" + strurl + ");' ");
but it return error as "undefined object AuditReports" as runtime it become like
href="javascript:OpenDialogue(Reports/ReportFilter.aspx);"
but when i add single quotes manually in firebug like
href="javascript:OpenDialogue('Reports/ReportFilter.aspx');"
it works fine.
can anyone suggest me that how to add single quotes in code.Yhankx in advance.
Try this
AnchorLeftMenuLinks.Append(" href='javascript:OpenDialogue(\"" + strurl + "\");' ");
Try:
var javascript = string.Format("href='javascript:OpenDialouge('{0}');'", strurl);
AnchorLeftMenuLinks.Append(javascript);
or:
AnchorLeftMenuLinks.AppendFormat("href='javascript:OpenDialouge('{0}');'", strurl);
Reason behind it was Javascript String because In JavaScript, a string is started and stopped with either single or double quotes. This means that the string was being chopped to: javascript:OpenDialogue( and your function's syntax was being incorrect and thus it was not working.
Thus it was mandatory to place a backslash (\)before each double quote in strurl. This turns each double quote into a string literal.
There are some other special characters also which needed to be placed using \
\' - single quote
\" - Double Quote
\\ - BackSlash
\n - new Line
\t - tab
I allow user input from TinyMCE on client and store it as a JSON string, then pass it to server ASP.NET C#.
The JSON String looks like this: { "mcfn2" : ";lt;p;gt;Trước đ& oacute;, việc tung ra t& ecirc;n miền lần đầu ti& ecirc;n được sự đồng & yacute; của ICANN - tổ chức quản l& yacute; t& ecirc;n miền quốc tế" } (JSON string contains Vietnamese accent)
But when process on server, I received error "Unterminated string. Expected delimiter: ". Line 1, position ...." (It looks like the error happened because of đ& oacute;). (In this page, I seperate & with character after it by a space, because it will automacally converted to a Vietnamese if there are no space)
There are no error if user input is English text (no Vietnamese accent).
Please guide me how to fix this error.
I know at this time this will probably not be useful for you, but maybe it can help another person.
You should convert your string to UTF8 to deal with accents (Vietnamese and many other languages) before serializing it to JSON. For that you can use this function:
private string ConvertToUtf8(string textOriginal)
{
if (!string.IsNullOrEmpty(textOriginal))
{
byte[] bytes = Encoding.Default.GetBytes(textOriginal);
return Encoding.UTF8.GetString(bytes);
}
return string.Empty;
}
The URL link below will open a new Google mail window. The problem I have is that Google replaces all the plus (+) signs in the email body with blank space. It looks like it only happens with the + sign. How can I remedy this? (I am working on a ASP.NET web page.)
https://mail.google.com/mail?view=cm&tf=0&to=someemail#somedomain.com&su=some subject&body=Hi there+Hello there
(In the body email, "Hi there+Hello there" will show up as "Hi there Hello there")
The + character has a special meaning in [the query segment of] a URL => it means whitespace: . If you want to use the literal + sign there, you need to URL encode it to %2b:
body=Hi+there%2bHello+there
Here's an example of how you could properly generate URLs in .NET:
var uriBuilder = new UriBuilder("https://mail.google.com/mail");
var values = HttpUtility.ParseQueryString(string.Empty);
values["view"] = "cm";
values["tf"] = "0";
values["to"] = "someemail#somedomain.com";
values["su"] = "some subject";
values["body"] = "Hi there+Hello there";
uriBuilder.Query = values.ToString();
Console.WriteLine(uriBuilder.ToString());
The result:
https://mail.google.com:443/mail?view=cm&tf=0&to=someemail%40somedomain.com&su=some+subject&body=Hi+there%2bHello+there
If you want a plus + symbol in the body you have to encode it as 2B.
For example:
Try this
In order to encode a + value using JavaScript, you can use the encodeURIComponent function.
Example:
var url = "+11";
var encoded_url = encodeURIComponent(url);
console.log(encoded_url)
It's safer to always percent-encode all characters except those defined as "unreserved" in RFC-3986.
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
So, percent-encode the plus character and other special characters.
The problem that you are having with pluses is because, according to RFC-1866 (HTML 2.0 specification), paragraph 8.2.1. subparagraph 1., "The form field names and values are escaped: space characters are replaced by `+', and then reserved characters are escaped"). This way of encoding form data is also given in later HTML specifications, look for relevant paragraphs about application/x-www-form-urlencoded.
Just to add this to the list:
Uri.EscapeUriString("Hi there+Hello there") // Hi%20there+Hello%20there
Uri.EscapeDataString("Hi there+Hello there") // Hi%20there%2BHello%20there
See https://stackoverflow.com/a/34189188/98491
Usually you want to use EscapeDataString which does it right.
Generally if you use .NET API's - new Uri("someproto:with+plus").LocalPath or AbsolutePath will keep plus character in URL. (Same "someproto:with+plus" string)
but Uri.EscapeDataString("with+plus") will escape plus character and will produce "with%2Bplus".
Just to be consistent I would recommend to always escape plus character to "%2B" and use it everywhere - then no need to guess who thinks and what about your plus character.
I'm not sure why from escaped character '+' decoding would produce space character ' ' - but apparently it's the issue with some of components.