When I load a page that references non-existent URLs through tags, I get a strange text reply from the server.
The server does return status code 404 as expected, but it also includes a text response made up of Chinese-looking characters.
Is the server infected, or is the error message just in a language other than English?
If it is infected, how can I find out where?
Here is an example of the replies. The reply seems to be identical for each 404 error, although part of it changes depending on the file type that was not found.
㰡䑏䍔奐䔠桴浬⁐啂䱉䌠∭⼯圳䌯⽄呄⁘䡔䵌‱⸰⁓瑲楣琯⽅丢•桴瑰㨯⽷睷㌮潲术呒⽸桴浬ㄯ䑔䐯硨瑭氱瑲楣琮摴搢㸍਼桴浬⁸浬湳㴢桴瑰㨯⽷睷㌮潲术ㄹ㤹⽸桴浬∾ഊ㱨敡搾ഊ㱭整愠桴瑰ⵥ煵楶㴢䍯湴敮琭呹灥∠捯湴敮琽≴數琯桴浬㬠捨慲獥琽楳漭㠸㔹ⴱ∯㸍਼瑩瑬放㐰㐠ⴠ䙩汥爠摩牥捴潲礠湯琠景畮搮㰯瑩瑬放ഊ㱳瑹汥⁴祰攽≴數琯捳猢㸍਼ℭⴍ潤祻浡牧楮㨰㭦潮琭獩穥㨮㝥活景湴ⵦ慭楬示噥牤慮愬⁁物慬Ⱐ䡥汶整楣愬慮猭獥物昻...
You have a character encoding issue. The response is an ASCII file that is being interpreted as a 2-byte character encoding, so each pair of ASCII bytes is displayed as one strange character.
To translate it, I copied the text into Notepad, saved it as Unicode Big Endian, and then used a hex editor to strip the first two bytes (the byte order mark that identifies the file as Unicode). Opening it again gave me:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/> <title>404 - File or directory not found.</title> <style type="text/css"> <!-- body{margin:0;font-size:.7em;font-family:Verdana, Arial, Helvetica, sans-serif;
You would need to check whether the server is declaring the wrong content type in an HTTP header (the content type in the meta tag looks correct) or whether something else is causing the problem.
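The Notepad and hex editor round trip can also be reproduced in a few lines of Python. This is only a sketch of the same idea: the function name `recover_ascii` is made up for illustration, and it assumes the garbled text came from ASCII bytes read as big-endian UTF-16, which is what the first characters of the sample suggest.

```python
def recover_ascii(garbled: str) -> str:
    """Undo 'ASCII bytes read as UTF-16 big-endian' mojibake."""
    # Re-encode each character to the two bytes it was built from...
    raw = garbled.encode("utf-16-be")
    # ...and read those bytes one at a time (Latin-1 maps every byte to a character).
    return raw.decode("latin-1")


# The first seven characters of the garbled reply, written as escapes.
sample = "\u3c21\u444f\u4354\u5950\u4520\u6874\u6d6c"
print(recover_ascii(sample))  # <!DOCTYPE html
```

Each garbled character splits back into two ASCII characters, for example U+3C21 becomes the bytes 0x3C 0x21, which are `<!`.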
When I use the LinkedIn API to get profile information, the picture URLs are sometimes not accessible.
I get this response:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>403 - Forbidden</title>
</head>
<body>
<h1>403 - Forbidden</h1>
</body>
</html>
Upon further digging, I found that LinkedIn returns something like this for the original picture URL:
.../profile-originalphoto-shrink_900_1200...
However, when I go to LinkedIn.com and check the URL of the profile picture, it looks something like this:
.../profile-originalphoto-shrink_800_800...
Has anyone else faced this issue? What is going on here?
I also verified that I'm using the correct scope, "r_basicprofile".
Rather than requesting the original picture-url, you can request a resized picture-url. For example:
picture-url;size=400
where size can be 100, 200, or 400.
I compared the picture URL on my public profile with the one returned by the API.
In the XML returned by the API, the parameters v and t after the "?" were separated by the XML-escaped "&amp;" instead of a plain "&", e.g.
https://media.licdn.com/.../profile-displayphoto-shrink_200_200/0?e=152800&amp;v=beta&amp;t=LJTrw_oj9npH06X1u0HjQ
Replacing it with something like pictureURL = pictureURL.replaceAll("&amp;", "&"); fixed the issue for me. Hope this helps.
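For comparison, here is a minimal Python sketch of the same clean-up. It assumes the stray sequence in the URL is the XML entity `&amp;`, as it would be when a URL is copied verbatim out of an XML document. The helper name `fix_picture_url` and the example host are hypothetical. `html.unescape` from the standard library decodes the entity, along with any other character references that may be present.

```python
import html


def fix_picture_url(url_from_xml: str) -> str:
    """Turn an XML-escaped URL back into a usable one (&amp; -> &)."""
    return html.unescape(url_from_xml)


escaped = "https://example.invalid/photo/0?e=152800&amp;v=beta&amp;t=abc"
print(fix_picture_url(escaped))
# https://example.invalid/photo/0?e=152800&v=beta&t=abc
```

Reading the value through a real XML parser instead of taking it from the raw text should avoid the problem entirely, since parsers decode entities for you.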
Rich text field value <div> </div>
output <div>?</div>
DWT:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
##Component.Fields.Text##
</html>
What did I miss?
Thanks.
I can't try it out now due to lack of time, but a few things come to mind:
What's the encoding set on your publication target?
Do you get the same result in Template Builder and Preview?
If you open the published file with a text editor like Notepad++, what does it show?
EDIT
In preview I get this:
<div> </div>
After publishing I get the same as you:
<div>?</div>
So I changed my publication target to use "Unicode (UTF-8)" instead of "System Default", and now when I publish I get this:
<div>Â </div>
I then referred to Elena's excellent 7 clues to deal with encoding, and figured out I was missing this on my web.config:
<globalization fileEncoding="UTF-8" requestEncoding="UTF-8" responseEncoding="UTF-8"/>
That still didn't do it; the weird character was still showing up between the tags. The last clue was that the encoding was not being applied to pages with the ".html" extension. I renamed my page template to have a .aspx extension, published, pressed F5 and, like magic, my div now shows:
<div> </div>
EDIT 2: If you want to keep the .html extension, just add this to your page's <head>:
<meta http-equiv="Content-type" content="text/html;charset=UTF-8">
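The intermediate "Â" result is the typical sign of UTF-8 bytes being read as a single-byte Western encoding. The sketch below reproduces it under one assumption the posts above do not state explicitly: that the character inside the div is a non-breaking space (U+00A0). The function name is made up for illustration.

```python
NBSP = "\u00a0"  # assumed content of the rich text field


def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


published = f"<div>{NBSP}</div>"
garbled = misread_utf8_as_latin1(published)
print(repr(garbled))  # '<div>Â\xa0</div>'

# Reversing the mistake recovers the original markup.
assert garbled.encode("latin-1").decode("utf-8") == published
```

In UTF-8 the non-breaking space is the two bytes 0xC2 0xA0. Read one byte at a time, 0xC2 is "Â" and 0xA0 is a non-breaking space again, which is why the page needs to declare UTF-8 once the file is published as UTF-8.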
Caveat: Before someone marks this as a duplicate of this, please understand that it is not. The accepted answer there is exactly what I am doing, yet I am facing the following issue.
HTML file in client folder looks like this:
<head>
<meta charset="utf-8"/>
<title>blah-blah</title>
---
The message I am getting in the Firebug console is:
The character encoding declaration of the HTML document
was not found when prescanning the first 1024 bytes of
the file. When viewed in a differently-configured browser,
this page will reload automatically. The encoding
declaration needs to be moved to be within the first
1024 bytes of the file.
When I view the page source, I see a whole bunch of link stylesheet and script tags between the head and the meta charset element.
If I remove the meta charset, I get this in the Firebug console:
The character encoding of the HTML document was not
declared. The document will render with garbled text
in some browser configurations if the document
contains characters from outside the US-ASCII range.
The character encoding of the page must to be declared
in the document or in the transfer protocol.
How do I get the meta charset tag to appear right after the head?
What I did was edit /usr/lib/meteor/app/lib/app.html.in, and add the meta charset line so that the file now looks like this:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/> <!-- added this line -->
{{#each stylesheets}} <link rel="stylesheet" href="{{this}}">
{{/each}}
...
And of course I removed the meta charset line from my own HTML files.
I think this is the way to go for now; I expect it will be resolved in a future revision.
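To confirm that a change like this helps, you can check where the declaration ends up in the HTML that is actually served. The following is a rough sketch, not the browser's real prescan algorithm: it only looks for a `<meta ... charset=` pattern within the first 1024 bytes, the limit quoted in the console message. The function name and the sample markup are made up for illustration.

```python
import re

PRESCAN_LIMIT = 1024  # bytes, the limit named in the console message

# Matches both <meta charset="..."> and the http-equiv form with charset=...
_META_CHARSET = re.compile(rb"<meta[^>]*charset\s*=", re.IGNORECASE)


def charset_declared_early(html: bytes, limit: int = PRESCAN_LIMIT) -> bool:
    """Return True if a meta charset declaration starts and ends within `limit` bytes."""
    return _META_CHARSET.search(html[:limit]) is not None


early = b'<!DOCTYPE html><html><head><meta charset="utf-8"/><title>x</title>'
late = (b"<!DOCTYPE html><html><head>"
        + b'<link rel="stylesheet" href="a.css">' * 40
        + b'<meta charset="utf-8"/>')

print(charset_declared_early(early))  # True
print(charset_declared_early(late))   # False
```

The `late` sample mimics the situation in the question, where generated link and script tags push the meta element past the prescan window.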
I had a similar problem: I needed to force IE to use its latest rendering mode.
I had to add
<meta http-equiv="x-ua-compatible" content="IE=edge">
directly after the <head> tag. And app.html.in no longer seems to be used.
So I made this change in tools/latest/tools/bundler.js,
line 783:
'<head><meta http-equiv="x-ua-compatible" content="IE=edge">\n');
That forced the tag into the HTML boilerplate.
In one of my ASP.NET applications I found strange behaviour in Internet Explorer 9, while IE8 works fine.
I need UTF-8 as the default encoding. That's important because I use German umlauts such as "ÄäÖüÜü".
When the page is loaded for the first time, IE9 decides to use the "Western Europe" encoding. As far as I know that is ISO 8859-1, and the umlauts turn into strange letters.
On the second load, IE9 uses UTF-8 correctly.
In the source code I tried the following to tell IE which encoding to use:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 TRANSITIONAL//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="de">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
Why does IE9 behave so strangely on the first load?
And what else can I try to tell IE9 how to behave?
First, the server that hosts your site may be returning the wrong encoding information in the response header.
Second, there may be a mistake in the line that declares the encoding in the head of your page (for example, a wrong character in that string).
Third, open your page in a hex viewer (WinHex, for example) and post the first row of bytes. Sometimes an editor places unexpected data in the first bytes; I stumbled on that once.
If the site is online, post its URL and I'll try to find the problem.
Check your server's response headers. They must contain something like this:
Key Value
Content-Type text/html; charset=utf-8
Response HTTP/1.1 200 OK
If they don't, check your server settings or your code; there must be a place where the Content-Type header is being changed.
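If you have the raw Content-Type value in hand, the declared charset can be pulled out with the Python standard library. This is a small sketch under the assumption that you already obtained the header value from your browser's developer tools or an HTTP client; the helper name is hypothetical.

```python
from email.message import Message


def charset_from_content_type(header_value: str):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=utf-8"))  # utf-8
print(charset_from_content_type("text/html"))                 # None
```

A result of `None` means the server leaves the encoding to the browser's own detection, which can fall back to a regional default such as a Western European encoding.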
EDIT: OK, the encoding is right. As suggested in the comment, you should check the first bytes of your response; it seems to start with additional bytes (usually a byte order mark that carries information about the encoding).
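Checking the first bytes does not require a hex editor. The sketch below compares the start of a response body against the byte order marks defined in Python's `codecs` module. The function name is made up, and how you obtain the raw bytes (a saved file, an HTTP client) is left to you.

```python
import codecs

# UTF-32 marks are checked first because the UTF-32-LE mark begins
# with the same two bytes as the UTF-16-LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding named by a leading byte order mark, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom(b"\xef\xbb\xbf<!DOCTYPE html>"))  # utf-8
print(detect_bom(b"<!DOCTYPE html>"))              # None
```

Anything other than `None` or the encoding you expect suggests that an editor or a server-side component is prepending bytes before the markup.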
I intend to create ASP.NET pages using Visual Studio 2008. Preferably, the pages should be fully compliant with the XHTML standard. How should I include diacritics in the page content (there is no need to use diacritics in URLs)? Should I use character references (the ones with "&"), or just type them directly from the keyboard?
Thank you.
You will need to make sure the page declares the correct character encoding. UTF-8 covers Western alphabets with diacritics and can also represent the ideographic scripts for which UTF-16 is sometimes used.
In the HEAD element of the page you will need some form of the following tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
You will also need to make sure you have the correct DOCTYPE specified:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
This is well covered by the W3C Character Sets Tutorial.
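To make the two options from the question concrete, here is a small Python sketch showing the same word written with a direct character (stored as UTF-8 bytes) and with a numeric character reference. The sample word is made up for illustration. Both forms display identically in a page that declares UTF-8, as long as the file is really saved in that encoding.

```python
word = "caf\u00e9"  # "café", typed directly

# Option 1: keep the character and store the file as UTF-8.
as_utf8 = word.encode("utf-8")

# Option 2: keep the file pure ASCII and use numeric character references.
as_char_refs = word.encode("ascii", "xmlcharrefreplace")

print(as_utf8)       # b'caf\xc3\xa9'
print(as_char_refs)  # b'caf&#233;'
```

The character reference form survives an incorrect encoding declaration, while the direct form keeps the source readable.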