QtWebKit not rendering Japanese (Shift_JIS charset) - qt

I have an HTML file which I want to load in a QWebView. The header looks something like:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
</head>
The body text is mixed Latin and Japanese characters.
The page displays perfectly in Chrome, but all of the Japanese characters are replaced with □ when the page is displayed in a QWebView.
QtWebKit seems to use the same system as QTextCodec to handle conversions between Unicode and other charsets (please correct me if I'm wrong on this), and I'm therefore working on the assumption that QtWebKit can support Shift_JIS.
As a test, I've tried adding the numeric character reference for a specific kanji character (e.g. &#x3041; to display ぁ) to my HTML file. The character renders properly in Chrome, but it also displays as □ in a QWebView. I'm not sure whether this means I can trust the Shift_JIS to Unicode conversion in Qt, but it certainly means I can't assume that the conversion is the cause of the problem.
I'm not sure where to go from here; any suggestions as to solutions or other areas to investigate would be much appreciated.

Turns out I've been over-thinking this one. There is in fact a pretty simple solution:
When confronted with Kanji characters which the current font is unable to display, Chrome is clever enough to fall back to a font which does support those characters (on my Win 7 PC the default Kanji font is MS Gothic).
QtWebKit does not have this feature, and hence it is necessary to explicitly specify (in CSS) a Kanji-capable font for the areas which need it.
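As a rough sketch of that fix, the rule can be written directly in the page's stylesheet or, as below, generated and injected from script. The `.jp` class name and the `fontRule` helper are made up for illustration, and MS Gothic is simply the font mentioned above; substitute whichever Kanji-capable font is installed on the target systems.

```javascript
// Hypothetical helper: builds a CSS rule that assigns a font stack to a selector.
function fontRule(selector, families) {
  const list = families
    // Names made only of lowercase letters and hyphens (e.g. the generic
    // keyword sans-serif) are left unquoted; anything else is quoted.
    .map((name) => (/^[a-z-]+$/.test(name) ? name : `"${name}"`))
    .join(', ');
  return `${selector} { font-family: ${list}; }`;
}

// ".jp" is an assumed class on the elements that contain Japanese text.
const rule = fontRule('.jp', ['MS Gothic', 'sans-serif']);
console.log(rule);

// In a page, inject the rule; outside a browser this part is skipped.
if (typeof document !== 'undefined') {
  const style = document.createElement('style');
  style.textContent = rule;
  document.head.appendChild(style);
}
```

Listing a generic family last leaves the renderer something to use on systems where the named font is missing.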

Related

Adjust CSS to make OSX Chrome Print Emoji

I cannot get Chrome on OSX to print emoji. Is there a CSS trick or some other workaround?
Here are 2 emoji: 👍🇦🇹
When I try to print this page, the emoji space is preserved, but it's white. In Safari printing the emoji works just fine.
Here is a screenshot of the print preview of this page on Chrome:
After a lot of dialog in the question's comments, it seems you have a font rendering issue (perhaps a Chrome bug). I do not think this can be solved with any combination of HTML, CSS, or JavaScript.
There is, however, the option to work around the issue by not using a font.
You can use a vector image like SVG to have the same kind of scaling capabilities as a font:
SVG for 👍 THUMBS UP SIGN Unicode character
SVG for 🇦 REGIONAL INDICATOR SYMBOL LETTER A Unicode character
SVG for 🇹 REGIONAL INDICATOR SYMBOL LETTER T Unicode character
SVG for Thumbs up sign
SVG for Austrian flag
Just link to the SVG as an image and specify its dimensions either in HTML or in CSS as needed.
With a little work, you could automate conversion of user-generated emoji to images by using a dictionary of known images and supplementing the misses with either the SVG or the emoji PNG at FileFormat.Info. They have a list of emoji you could scrape (assuming it's not a violation of their terms of service) for PNGs, as well as an SVG for every character (emoji or otherwise), which can be obtained from just the character code. For example, here's U+1f44d (👍):
http://www.fileformat.info/info/unicode/char/1f44d
It'll be the only SVG link on the page, so you could do this in JS:
var svg_src = fileformat_info.querySelector('a[href$=".svg"]').href;
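The character page URL follows directly from the code point, so it can be built from the emoji itself. A minimal sketch, assuming the URL pattern shown above holds for other characters; `charPageUrls` is a name invented here:

```javascript
// Hypothetical helper: maps each code point in a string to its
// FileFormat.Info page, following the URL pattern shown above.
function charPageUrls(text) {
  // Array.from iterates by code point, so characters outside the BMP
  // (stored as surrogate pairs) are not split in half.
  return Array.from(text, (ch) =>
    'http://www.fileformat.info/info/unicode/char/' +
    ch.codePointAt(0).toString(16));
}

console.log(charPageUrls('\u{1F44D}'));          // thumbs up sign
console.log(charPageUrls('\u{1F1E6}\u{1F1F9}')); // regional indicators A and T
```

A flag such as 🇦🇹 is made of two code points, so it yields two separate pages rather than one page for the flag.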
That said, it'd be vastly preferable to have this ready-made rather than creating it from scratch. @pandawan's answer suggesting twemoji looks promising.
Perhaps there is a CSS-only workaround: I've also read elsewhere that you can properly print these characters by avoiding bold (and perhaps all font and text manipulation? perhaps just make the font size bigger?). This may depend on the fonts installed on the client system.
This is due to a rendering difference between Chrome and Safari. I would not call it a bug, since I do not believe that the expected behavior is defined anywhere (Firefox has issues rendering your emoji too, by the way).
If you want a full and simple emoji support across all platforms you can use twemoji, a small library developed by Twitter for this specific need.

Why is there a question mark in a random place on my web page?

I generate a web page with Razor, and sometimes the browser shows a question mark in place of a single, seemingly random Unicode character.
For example:
I think this question mark is displayed where the first byte of a two-byte Unicode character arrives in one TCP packet and the second byte in another. But why doesn't the browser join them correctly?
All files are encoded in UTF-8, and the page contains <meta charset="utf-8">.
Update
The question marks depend on the page content. If I change the content before a question mark, it may disappear or move to another place (replacing a different character).
Encoding the characters in UTF-8 is not the only thing to consider when working with encodings. The font family also plays a large role in rendering the correct glyph for each character; characters are, after all, just glyphs drawn on screen. The encoding takes care of the bytes of each character (1, 2, 3, or 4 of them in UTF-8), but the correct character only appears on screen if your framework and font family support the glyph.
On your website, the font family (probably a custom-loaded one) does not support this character, or more specifically its code page, which is why the browser has to fall back to displaying a question mark. You also say that the affected character is seemingly random, which points to a font-family problem. I would advise trying your application with the 'Segoe UI' font family to see whether that works, because it probably will.
Apart from that suggestion, please make sure that the font family supports the code page containing this character. Otherwise, a question mark will be displayed.
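The split-character hypothesis from the question can also be checked independently of fonts. A decoder that is fed a stream buffers an incomplete multi-byte sequence until the rest arrives, whereas code that decodes each chunk on its own produces replacement characters instead. A minimal sketch using the standard TextDecoder API, with a two-byte character split by hand to stand in for a chunk boundary (the sample bytes are an assumption, not taken from the page in question):

```javascript
// "é" is encoded in UTF-8 as the two bytes C3 A9; split them across two chunks.
const first = new Uint8Array([0x63, 0x61, 0x66, 0xc3]); // "caf" + first byte of "é"
const second = new Uint8Array([0xa9]);                  // second byte of "é"

// Decoding each chunk independently: each half becomes U+FFFD.
const naive = new TextDecoder('utf-8').decode(first) +
              new TextDecoder('utf-8').decode(second);

// Streaming decode: the decoder keeps the dangling byte until the next chunk.
const decoder = new TextDecoder('utf-8');
const streamed = decoder.decode(first, { stream: true }) +
                 decoder.decode(second, { stream: true }) +
                 decoder.decode(); // flush any remaining state

console.log(JSON.stringify(naive), JSON.stringify(streamed));
```

If the page is assembled from byte chunks anywhere on the server side, per-chunk decoding of this kind is one possible source of question marks that move when earlier content changes.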

render specific font bigger than other fonts

I'm searching for a way to tell the browser to render every glyph drawn with a specific font, e.g. FreeMono, at a bigger font size than glyphs drawn with other fonts. The reason is that I use characters like ᚠ on a website, and these glyphs are rendered using FreeMono in Chrome (see inspect element → computed → rendered fonts) and always look too small for the surrounding text. Is there any way I can do that?
You cannot. CSS has no tools for such font-specific tuning, apart from the font-size-adjust property, which has very limited effect, limited browser support, and buggy support.
If you use a character such as “ᚠ” U+16A0 RUNIC LETTER FEHU FEOH FE F on a web page, then it will be up to each browser in each system which font (if any) is used to render it, at least if you do not explicitly suggest some font(s) that contain it. It may be FreeMono, but most computers in the world do not have it. Besides, in FreeMono, “ᚠ” is rather large—taller than uppercase Latin letters. So if it looks too small, the reason might be a mix of fonts.
To make, say, Runic letters match the style of other text, you should try to find a font that is suitable for both, so that you can use a single font, designed by a typographer to make things fit. You would then probably need to find a suitable free font and use it as a downloadable font (with @font-face). It might be FreeSerif or FreeSans; only in very peculiar circumstances would I consider FreeMono, a monospace font, suitable for rendering computer code in some cases and mostly unsuitable for everything else.

Japanese characters not rendered uniformly under IE

While developing a Japanese website for my client, I encountered the following:
In the screenshot, the hiragana/katakana and the kanji look as if they are rendered in two different fonts.
In Google Chrome and other modern browsers:
It's not limited to Japanese. If I use Simplified Chinese (GB2312), the same issue occurs; Traditional Chinese (Big5) has no such issue.
The rendering in IE looks terrible, and the major audience will be using IE. How can I solve this issue?
I did not specify any font in CSS.
Most probably, the rendering does use two (or more) different fonts. To fix this, specify a list of fonts for the content, selecting the fonts so that a) each of them contains all the characters you are using and b) the computers that your visitors will use contain at least one of those fonts.
As a rough starting point, check out http://en.wikipedia.org/wiki/List_of_Microsoft_Windows_fonts
The reason for different browser behavior is that browsers have different default fonts and fallback strategies (for selecting alternate fonts when the browser's default font does not contain all the characters needed).
It's also a good idea to declare the language: <html lang="ja">. It may affect font choice by a browser, so that a font suitable for Japanese is selected. But this normally does not matter when you specify fonts in your document.

CSS Text transform character

I am using text-transform: lowercase; to lowercase my users' posts. But when I use this code, characters like "i,ş,ç,ü,ğ" become something else. How can I fix this?
The declaration text-transform: lowercase leaves lowercase letters intact. It is very unlikely that any browser has a problem with this. If you remove the CSS declaration, you will most probably see the letters already as “something else”.
The odds are that the problem is elsewhere, in the transfer of user input to web page content. It is easy to go wrong here, due to character encoding problems. More information about the situation (a URL would be good start, and so would a description of “something else”) is needed to analyze them, and the issue would fall under a different heading.
Regarding lowercasing, it should normally be performed server-side, not in CSS. Note that text-transform: lowercase cannot properly handle Turkish or Azeri text, as it unconditionally maps both “I” and “İ” to “i”, not to “ı” and “i”. Proper support for them has been promised for Firefox 14, presumably to be used when the content language has been suitably defined using the lang attribute, but it will take a long time before such processing is common across browsers. In server-side processing, it is usually very easy to deal with this as a special case.
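A minimal sketch of such special-casing, written in JavaScript as it might run under Node.js on the server. The function name is invented here, and it assumes the text is already known to be Turkish or Azeri:

```javascript
// Hypothetical helper for text known to be Turkish or Azeri.
function lowerTurkic(text) {
  return text
    .replace(/\u0130/g, 'i')  // dotted capital İ (U+0130) -> dotted i
    .replace(/I/g, '\u0131')  // dotless capital I -> dotless ı (U+0131)
    .toLowerCase();           // every other letter lowercases as usual
}

console.log(lowerTurkic('I\u015EIK \u0130STANBUL'));
```

Where the runtime ships the necessary locale data, the built-in `toLocaleLowerCase('tr')` is an alternative to the explicit replacements.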
It does not seem like text-transform should cause any issues:
http://jsfiddle.net/m9fpX/
Are you setting the following?
<meta http-equiv="content-type" content="text/html; charset=UTF-8"/>
