I am using IE version 8 in my web application.
I am sending a string that contains a space character, which I have encoded as "%20". When I send this string to a specific URL, the "%20" is interpreted as an underscore instead of a space. Can anybody tell me what might have gone wrong?
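A URL cannot contain a literal space, so the space has to be encoded. "%20" is the percent-encoding of the space character (U+0020); an underscore would be "%5F". If an underscore shows up on the receiving side, the replacement is most likely being made by the code that handles the URL there, not by the encoding itself.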
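A quick way to check what "%20" and "%5F" stand for is to round-trip them through a standard URL library. The sketch below uses Python's urllib.parse purely as an illustration; the sample strings are made up and have nothing to do with the application in the question.

```python
from urllib.parse import quote, unquote

# Encoding a space yields %20.
encoded = quote("hello world")

# Decoding %20 yields a space, while %5F is the underscore.
decoded = unquote("hello%20world")
underscore = unquote("hello%5Fworld")

print(encoded)     # hello%20world
print(decoded)     # hello world
print(underscore)  # hello_world
```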
Last year I asked a question about copying the entry value in Xamarin.Forms.
When I test it now, the white spaces in the text are replaced by a + symbol. Pasting emojis also does not work properly.
I am using the Xamarin.Plugins.Clipboard NuGet package to copy text to the clipboard. To copy the text I use the following code:
CrossClipboard.Current.SetText(message);
A long press on the device shows the paste option, which I use to paste the copied text.
Please suggest a way to avoid the + symbol and to make emoji copy and paste work.
Thanks in advance.
The text appears to be URL-encoded, so you need to decode it with WebUtility.UrlDecode(String):
Converts a string that has been encoded for transmission in a URL into a decoded string.
If characters such as blanks and punctuation are passed in an HTTP stream, they might be misinterpreted at the receiving end. URL encoding converts characters that are not allowed in a URL into equivalent hexadecimal escape sequences. The UrlEncode method creates a URL-encoded string.
URL decoding replaces hexadecimal escape sequences with corresponding ASCII character equivalents. For example, when embedded in a block of URL-encoded text, the escape sequences %3c and %3e are decoded into the characters < and >.
A sample follows:
using System.Net;
Console.WriteLine("Encode:" + WebUtility.UrlEncode("😂"));
// out ==> %F0%9F%98%82
Console.WriteLine("Decode:" + WebUtility.UrlDecode("%F0%9F%98%82"));
// out ==> 😂
Console.WriteLine("Encode:" + WebUtility.UrlEncode("this is a text message"));
// out ==> this+is+a+text+message
Console.WriteLine("Decode:" + WebUtility.UrlDecode("this+is+a+text+message"));
// out ==> this is a text message
Solution:
Instead of calling CrossClipboard.Current.SetText(message); directly,
try CrossClipboard.Current.SetText(WebUtility.UrlDecode(message));
I have a URL like this: localhost:8080/demo/
When I call localhost:8080/demo/''''''''' it works fine.
But when I try localhost:8080/demo/;;; it does not work and returns HTTP code 404 Not Found.
I tried a few other special characters (# % \ ? /) and those returned 400.
Can anyone explain this to me?
Thank you so much!
These special characters are not directly allowed in URLs,
because they have special meanings there.
For example:
/ is separator within the path,
? marks the query part of a URL,
# marks the fragment (a page-internal link),
etc.
Quoted from Wikipedia: Percent-encoding reserved characters:
When a character from the reserved set (a "reserved character")
has special meaning (a "reserved purpose") in a certain context,
and a URI scheme says that it is necessary to use that character
for some other purpose, then the character must be percent-encoded.
Percent-encoding a reserved character involves converting the
character to its corresponding byte value in ASCII and then
representing that value as a pair of hexadecimal digits. The digits,
preceded by a percent sign (%) which is used as an escape character,
are then used in the URI in place of the reserved character.
For example: ; is a reserved character. Therefore, when ; is to occur
in a URL without its special meaning, it needs to be
replaced by %3B.
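To make the rule concrete, here is a minimal sketch in Python of percent-encoding a single path segment so that reserved characters lose their special meaning. The helper name encode_path_segment is made up for this example, and whether the server in the question accepts the encoded form depends on its own configuration.

```python
from urllib.parse import quote

def encode_path_segment(segment: str) -> str:
    """Percent-encode everything except unreserved characters.

    safe="" means that even "/" is encoded, which is what you want
    when the text is meant to be ONE path segment.
    """
    return quote(segment, safe="")

print(encode_path_segment(";;;"))      # %3B%3B%3B
print(encode_path_segment("a/b?c#d"))  # a%2Fb%3Fc%23d
```

The request would then go to localhost:8080/demo/%3B%3B%3B instead of localhost:8080/demo/;;;.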
I am new to using the Google Translate API, and during testing we noticed that for some translations (I have not been able to find a pattern yet) we get \u200b characters in the response. That results in a lot of issues, and above all it does not seem to serve any purpose or make any sense. A simple example:
https://www.googleapis.com/language/translate/v2?key=YOURKEY&source=NL&target=EN&q=Hergeneer%20verkopen
returns:
{
"data": {
"translations": [
{
"translatedText": "Sell \u200b\u200bHerge Down"
}
]
}
}
Our software stumbles over these \u200b strings/characters and I have not found a way to prevent them or get rid of them.
Please read the documentation of the JSON format: https://json.org/
A string is a sequence of zero or more Unicode characters.
A char is either any Unicode character except " or \ or control-character,
[...]
or it is \u followed by four hex-digits.
We are in this last case, \u followed by four hex-digits, and it represents a Unicode character: Unicode Character 'ZERO WIDTH SPACE' (U+200B). It even has its own Wikipedia page: Zero-width space. And its Stack Overflow question: What's HTML character code 8203?.
Now, there are plenty of Unicode characters with special behaviors, and this is one of them: an invisible one, among others. So you need to be aware of how Unicode works, and you should sanitize input and output from third-party APIs (and user input as well).
Just define the list of characters that you actually want to support, and be sure to strip or filter out all the others. For instance, if you want to support NL and EN, you could strip whatever is outside the Latin script in Unicode.
Stripping the U+200B that you're encountering, along with other undesirable characters, may save you from potential surprises such as:
big characters ⎲⎳
zalgo characters C̨̦̺̩̲̥͉̭͚̜̻̝̣̼͙̮̯̪o̴̡͇̘͎̞̲͇̦̲͞͡m̸̩̺̝̣̹̱͚̬̥̫̳̼̞̘̯͘ͅẹ͇̺̜́̕͢
invisible characters
emojis 👨👩👧👦#️⃣🏳️🌈
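As a rough sketch of that kind of sanitizing, the Python snippet below parses a response shaped like the one in the question and drops every character in the Unicode "Cf" (format) category, which includes U+200B. The function name strip_format_chars is made up, and filtering on "Cf" is only one possible policy: it also removes characters such as the zero-width joiner used in emoji sequences, so an explicit allow-list may suit you better.

```python
import json
import unicodedata

def strip_format_chars(text: str) -> str:
    """Remove invisible format characters (Unicode category Cf)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

# Raw string so that \u200b stays a JSON escape until json.loads decodes it.
response_body = r'{"data": {"translations": [{"translatedText": "Sell \u200b\u200bHerge Down"}]}}'

translated = json.loads(response_body)["data"]["translations"][0]["translatedText"]
cleaned = strip_format_chars(translated)

print(repr(translated))  # 'Sell \u200b\u200bHerge Down'
print(repr(cleaned))     # 'Sell Herge Down'
```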
For example, in Unix, a backslash (\) is a common escape character. So to escape a full stop (.) in a regular expression, one does this:
\.
But with % encoding URL parameters, we have an escape character, %, and a control code, so an ampersand (&) doesn't become:
%&
Instead, it becomes:
%26
Any reason why? Seems to just make things more complicated, on the face of it, when we could just have one escape character and a mechanism to escape itself where necessary:
%%
Then it'd be:
simpler to remember; we just need to know which characters to escape, not which to escape and what to escape them to
encoding-agnostic, as we wouldn't be sending an ASCII or Unicode representation explicitly, we'd just be sending them in the encoding the rest of the URL is going in
easy to write an encoder: s/[!\*'();:#&=+$,/?#\[\] "%-\.<>\\^_`{|}~]/%&/g (untested!)
better because we could switch to using \ as an escape character, and life would be simpler and it'd be summer all year long
I might be getting carried away now. Someone shoot me down? :)
EDIT: replaced two uses of "delimiter" with "escape character".
Percent encoding happens not only to escape delimiters, but also so that you can transport bytes that are not allowed inside URIs (such as control characters or non-ASCII characters).
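A small illustration of that point, using Python's urllib.parse as an example library: the hex digits after % carry the byte value itself, so the same mechanism covers delimiters, non-ASCII text (encoded as UTF-8 bytes here) and arbitrary bytes, while the URL on the wire stays plain ASCII. A scheme like %& could not represent a NUL byte or a byte above 0x7F that way.

```python
from urllib.parse import quote, quote_from_bytes, unquote_to_bytes

# A delimiter and the escape character itself.
print(quote("&"))  # %26
print(quote("%"))  # %25

# A non-ASCII character becomes its UTF-8 bytes, one %XX per byte.
print(quote("é"))  # %C3%A9

# Arbitrary bytes, including a control character, survive the round trip.
raw = b"\x00\xff"
encoded = quote_from_bytes(raw)
print(encoded)                           # %00%FF
print(unquote_to_bytes(encoded) == raw)  # True
```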
I guess it's because the URL specification, and specifically the HTTP part of it, only allows certain characters, so to escape the others one must replace them with characters that are allowed.
Also, some allowed characters have special meanings, like & and ?,
so replacing them with a control code seems the only way to solve it.
If you find it hard to recognize them, bookmark this page:
http://www.w3schools.com/tags/ref_urlencode.asp
I have a restful webservice which receives some structured data which is put straight into a database.
The data is sent from an OS using wget. I am just wondering whether I actually need to URL encode the data and, if so, why? Please note that it is no problem to do it, but it might be unnecessary in this scenario.
If your data has characters that aren't allowed in urls, you should url encode it.
The following characters are either reserved (like &) or simply liable to confuse the code that parses the URL. If your data contains these characters, URL encode it. Remember that if you are using any extended ASCII characters, Unicode characters or non-printable characters, you should also URL encode your data.
Dollar ("$")
Ampersand ("&")
Plus ("+")
Comma (",")
Forward slash/Virgule ("/")
Colon (":")
Semi-colon (";")
Equals ("=")
Question mark ("?")
'At' symbol ("@")
Space
Quotation marks
'Less Than' symbol ("<")
'Greater Than' symbol (">")
'Pound' character ("#")
Percent character ("%")
Left Curly Brace ("{")
Right Curly Brace ("}")
Vertical Bar/Pipe ("|")
Backslash ("\")
Caret ("^")
Tilde ("~")
Left Square Bracket ("[")
Right Square Bracket ("]")
Grave Accent ("`")
More info can be found here: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
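To see why encoding matters for structured data, here is a minimal sketch in Python that encodes a set of fields as a form body or query string and decodes it again. The field names and values are invented for the example. Without encoding, the & and = inside the values would be read as field separators on the receiving side.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical fields; note the reserved characters inside the values.
fields = {"name": "Tom & Jerry", "note": "50% off; a=b"}

body = urlencode(fields)
print(body)  # name=Tom+%26+Jerry&note=50%25+off%3B+a%3Db

# The receiver gets the original values back unchanged.
decoded = parse_qs(body)
print(decoded)  # {'name': ['Tom & Jerry'], 'note': ['50% off; a=b']}
```

A string built this way could then be handed to wget, for example through its --post-data option.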