I'm porting a Qt4 project to Qt5 (Qt 5.4.1 + VS2013). The project has some string translations, and the source file is UTF-8 encoded. Today I found that the following piece of code no longer works (it all worked well in Qt4).
this->paraList.push_back( QPair<QString,QString>( QString(tr("℃:")), QString(tr("Ω")) ) );
'paraList' is a QList, and the strings in it are eventually shown in a QTableWidget. Both strings display correctly in Qt Linguist, but when my application runs, the centigrade symbol and the Ohm symbol are not translated correctly.
All other strings are translated correctly. My locale is zh_CN. Why are these two characters so special?
The problem is encoding. You are using non-ASCII characters as the translation source text. Qt5 changed how C strings are converted (I don't remember the details), and I suspect this is the cause.
Try using trUtf8(); this should fix the problem.
I recently added a fix to the atom-script package to resolve garbled text output when compiling and running Java code in a non-English Windows environment. (Issue #1166, PR #2471)
After this, in the released script package 3.32.1, javac is now given both options -J-Dfile.encoding=UTF-8 and -encoding UTF-8. (The diff is here.) I have just realized that it would be better to pass the actual encoding of the currently active editor, which holds the target source code, to the -encoding option. After some research, I learned that the encoding can be obtained with
atom.workspace.getActiveTextEditor().getEncoding()
However, when I tested this on a Japanese edition of Windows, it returned shiftjis, which is not a valid encoding name for the javac command-line option. (It should be MS932, SJIS or something similar.) I have no idea where I can get this kind of encoding name without writing a large conversion table for all possible encoding names. Is there a good utility for this purpose?
EDIT:
To demonstrate what I intend to do, I have created a branch on my fork. The diff is here.
I get the encoding of the current source code editor with
const fileEncoding = getJavaTextEncodingName(atom.workspace.getActiveTextEditor().getEncoding())
and pass it to javac via
const cmd = `javac -encoding ${fileEncoding} -J-Dfile.encoding=UTF-8 -Xlint -sourcepath '${sourcePath}' -d '${tempFolder}' '${filepath}' && java -Dfile.encoding=UTF-8 -cp '${tempFolder}' ${classPackages}${className}`
The function getJavaTextEncodingName() is the core of the question.
function getJavaTextEncodingName(atomTextEncodingName) {
  switch (atomTextEncodingName) {
    case "shiftjis":
      return "MS932"
  }
  return "UTF-8"
}
Obviously this converts "shiftjis" to "MS932", but it would not be very elegant to implement every possible encoding name conversion here, so I am looking for a better alternative.
Here is a sample CSV file in UTF-8 format. It can be opened in Windows 7's Notepad, where the Chinese characters are displayed properly. Please download it.
http://pan.baidu.com/s/1sj0ia4H
Open cmd and run chcp 65001.
C:\Users\pengsir>sqlite3 e:\\test.db
SQLite version 3.8.4.3 2014-04-03 16:53:12
Enter ".help" for usage hints.
sqlite> create table ipo(name TEXT,method TEXT);
sqlite> .separator ","
sqlite> .import "e:\\tmp.csv" ipo
sqlite> select * from ipo;
000001,公开招募
000002,申请表抽ç¾é™é¢è®¤è´
000004,定å‘å‘è¡Œ
000005,银行储蓄å˜å•æ–¹å¼
000006,申请表抽ç¾é™é¢è®¤è´
000007,自办å‘è¡Œ
000008,自办å‘è¡Œ
000009,定å‘å‘è¡Œ
000010,定å‘å‘è¡Œ
000011,申请表抽ç¾ç‰é¢è®¤è´
sqlite>
Why does the same sqlite command display properly in sqlitemanager?
And how can I get Chinese characters to display in the sqlite console?
With pysqlite3, the output displays correctly in the Python console.
>>> import sqlite3
>>> con=sqlite3.connect("e:\\test.db")
>>> cur=con.cursor()
>>> cur.execute("select * from ipo;")
<sqlite3.Cursor object at 0x01751720>
>>> print(cur.fetchall())
[('000001', '公开招募'), ('000002', '申请表抽签限额认购'), ('000004', '定向发行'
), ('000005', '银行储蓄存单方式'), ('000006', '申请表抽签限额认购'), ('000007',
'自办发行'), ('000008', '自办发行'), ('000009', '定向发行'), ('000010', '定向发
行'), ('000011', '申请表抽签等额认购')]
>>>
This issue concerns how the Command Prompt window shows the characters; it is not about how sqlite3 prints the output.
As a simple demonstration, we leave sqlite3 out entirely and look at the file with the type command:
Let's see what happens on a different OS, for example OS X:
ISO-8859-1 corresponds to Western (Windows Latin 1); equivalent Windows code page setting: chcp 819
UTF8 corresponds to Unicode (UTF-8); equivalent Windows code page setting: chcp 65001
Much the same behavior also occurs on Windows:
use the chcp command to inspect and/or set your current code page
NOTICE: this is a screenshot of an Italian Windows XP, and as you can see there is still no luck! :-( In this case the cause is a lack of available fonts that can be configured in the command prompt properties on my "Windows XP" box:
I hope this is not the case on your "Windows Seven" box (but if it is, please leave me a comment so I can be more specific in this part of the answer).
When the problem comes down to the available fonts, additional language support has to be installed, and you still need to force UTF-8 with chcp 65001:
How to get proper fonts
Here is the list of steps I followed to get the result on ITA WinXP SP2 shown in the screenshot above:
Step 1 Install East Asian language files on your computer
Reading: how to install East Asian language files on your computer
In summary, both of these options were checked
and in the "Advanced" tab I selected Chinese:
Step 2 Switch from the raster font to a Chinese font in the terminal/Command Prompt window
Extra Step 3 (Optional) Check the font in Notepad
Notepad can be useful for inspecting fonts: for example, open tmp.csv and play with the fonts, but be aware of the necessary criteria for fonts to be available in a command window.
Well, the obvious problem is that Windows (pretty much in general) has trouble dealing with UTF-8. In particular, the command-line tool is by default set to a country-specific code page rather than Unicode.
Usually you can (temporarily) fix it by setting the code page for the command-line session to UTF-8, for example by typing:
chcp 65001
But in your case this does not really fix it, since sqlite still seems to run with the default charset, and there does not seem to be any option to set the current sqlite3 session to Unicode.
The good news is that your data is correct, and you can work with it correctly using sqlitemanager or similar tools that handle Unicode appropriately.
To further substantiate this: if you open your original CSV with Excel, it will probably also give you messed-up characters (since it usually does not default to Unicode). LibreOffice, by contrast, will typically ask you which encoding to use; given Unicode it will show the correct text, but given a different encoding (e.g. Western Europe) it will give you the same result as Excel (you can preview it there quite nicely, give it a shot).
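The same effect can be reproduced without sqlite, Excel or LibreOffice. Below is a minimal Python sketch, assuming the console decodes bytes with code page 936 (GBK), which is a guess at the default on a zh_CN Windows; the helper name show_as is made up for illustration. The stored bytes never change, only the encoding used to interpret them.

```python
# Hypothetical helper: decode bytes the way a console (or spreadsheet)
# configured for a given encoding would. "gbk" is an assumption for a
# zh_CN system; substitute whatever chcp reports on your machine.
def show_as(data: bytes, console_encoding: str) -> str:
    return data.decode(console_encoding, errors="replace")

original = "定向发行"
stored = original.encode("utf-8")   # the bytes as they sit in the CSV / database

garbled = show_as(stored, "gbk")    # wrong encoding: unreadable text
correct = show_as(stored, "utf-8")  # matching encoding: the original text

print(garbled)
print(correct)
```

Because the second call recovers the original text from the very same bytes, the data itself is fine and only the display side needs fixing.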
Hope this helps!
I am writing a Qt 5 application (with Qt Creator) which uses special characters such as zodiac signs. This code works perfectly fine on Linux Mint 14:
QString s = QString::fromUtf8("\u2648");
But when I compile it on Windows XP SP3, I get a compiler warning saying that the current code page is cp1252 and the character \u2648 cannot be converted. When I run the program, this character is displayed as a question mark.
According to my system settings, UTF-8 (code page 65001) is installed on my Windows.
(Note: I have not tried this, I don't know which compiler you are using, and I am completely unfamiliar with Qt, so I could be wrong. The following is based on general knowledge about Unicode on Windows.)
On Windows, 8-bit strings are generally assumed to be in the current codepage of the system (also called the "ANSI" codepage). This is never UTF-8. On your system, it's apparently cp1252. So there are actually two things going wrong:
You are specifying a Unicode character, which the compiler tries to convert to the current code page. On Windows, this results in a compile-time error, because cp1252 doesn't have a code point to represent U+2648.
But even assuming that the code compiled, it would still not work. You pass this string, which would be in cp1252, to fromUtf8, which wants a UTF-8 string. As the string is not valid UTF-8, this would likely result in a runtime error.
On your Linux system, both work "by accident", because it uses UTF-8 for 8-bit strings.
To get this right, specify the 8-bit string in UTF-8 right away:
QString s = QString::fromUtf8("\xE2\x99\x88");
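To double-check the byte values in the escaped literal, here is a small Python sketch (Python is used only because it makes the bytes easy to inspect; it is not part of the Qt build). It shows that U+2648 encodes to E2 99 88 in UTF-8 and that cp1252 has no representation for it.

```python
aries = "\u2648"  # ARIES zodiac sign

# UTF-8 encoding: these are the bytes written as \xE2\x99\x88 above.
utf8_bytes = aries.encode("utf-8")
print(utf8_bytes)  # b'\xe2\x99\x88'

# cp1252 has no slot for this character, so the conversion fails.
try:
    aries.encode("cp1252")
    representable = True
except UnicodeEncodeError:
    representable = False
print(representable)  # False
```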
Here is my advice to get everything working:
There is only one encoding: UTF-8! Use it everywhere if possible. So, in the Qt Creator settings, set the default encoding for source files to UTF-8.
You can convert your source code in Qt Creator: Edit -> choose the encoding, and reload the file in that code page. If that can't be done, use the Linux console application iconv like this:
iconv -f cp1252 -t utf-8 your_source_in_cp1252.cpp > your_source_in_utf8.cpp
I use this code snippet for C-strings in my source codes: in main.cpp add #include <QTextCodec>, and then do:
// For correct encoding
QTextCodec *codec = QTextCodec::codecForName("UTF-8");
QTextCodec::setCodecForCStrings(codec);
In my Qt application my source code files are encoded as UTF-8. For the following code...
QMessageBox::critical(this, "Nepoznata pogreška", "Dogodila se nepoznata pogreška! Želite li zatvoriti ovaj program ?", QMessageBox::Yes, QMessageBox::No);
...when I show that message box, the character "š" wouldn't be displayed as "š", but as something strange. This is because Qt converts all C-strings as if they are encoded using LATIN-1. To solve this I've been using:
QMessageBox::critical(this, QString::fromUtf8("Nepoznata pogreška"), QString::fromUtf8("Dogodila se nepoznata pogreška! Želite li zatvoriti ovaj program ?"), QMessageBox::Yes, QMessageBox::No);
Is there a way to get rid of all the calls to QString::fromUtf8()?
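The garbling described above can be reproduced outside Qt. This is a minimal Python sketch of the mechanism only (it does not call any Qt API): the UTF-8 bytes of "š" are interpreted as Latin-1, which yields two unrelated characters.

```python
text = "š"                        # U+0161
utf8_bytes = text.encode("utf-8")  # two bytes: 0xC5 0xA1

# Reading those bytes as Latin-1 maps each byte to its own character.
misread = utf8_bytes.decode("latin-1")
print(misread)  # Å¡

# Reading them as UTF-8 gives the intended character back.
print(utf8_bytes.decode("utf-8"))  # š
```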
Have you tried using QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8"))?
setCodecForCStrings() has been deprecated.
Try instead,
QTextCodec::setCodecForLocale(QTextCodec::codecForName("UTF-8"));
It worked for me.
Regarding the "guess" that "Qt5 assumes all source files are UTF-8 encoded": Thiago Macieira explains the decision made by Qt's developers here.
The assumption can be disabled with QT_NO_CAST_FROM_ASCII according to the documentation.
I have seen console apps print colors, and apps such as ffmpeg print text over itself instead of on a new line. How do I print over an existing line? I want to display the fps in my console app at either the very top or the very bottom, and have regular printfs go elsewhere and scroll normally.
I need this for Windows, but it is meant to be cross-platform, so I will eventually have Linux and Mac implementations.
There are two simple possibilities that work on Linux as well as Windows, but only for a single line:
printf("\b"); moves the cursor back one character, so you can count how many characters you want to back up over and call it in a loop; or, if you know you only write n digits, do it like printf("\b\b\b\b\b\b\b\b\b\b");
printf("text to be overwritten by next printf\r"); this returns the cursor to the beginning of the line, so the next printf will overwrite it. Make sure to write a string of the same length or longer so that you overwrite it entirely.
If you want to rewrite several lines, nothing is as portable as ncurses; there are libraries for it on practically every operating system, and you don't have to take care of the ANSI differences.
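The carriage-return approach can be sketched as follows. This is a minimal Python illustration rather than the C code the question asks about; the function names status_line and show_progress and the fixed width of 20 are made up for the example. Padding every message to the same width takes care of the "same length or longer" requirement.

```python
import sys
import time

def status_line(text: str, width: int) -> str:
    # "\r" moves the cursor to column 0; padding to a fixed width makes
    # sure a shorter message fully covers a longer previous one.
    return "\r" + text.ljust(width)

def show_progress(values, out=sys.stdout, delay=0.0):
    for v in values:
        out.write(status_line(f"fps: {v}", 20))
        out.flush()      # no newline is written, so flush explicitly
        time.sleep(delay)
    out.write("\n")      # finish the line once done

show_progress([60, 59, 7])
```

The same idea carries over to C with printf("\r...") followed by fflush(stdout).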
edit: added link to ncurses wikipedia page, gives great overview and introduction, as well as link list and maybe a translation to your preferred language
Check out ncurses. It has bindings for most scripting languages.
You can use '\r' instead of '\n'.
ASCII character number 8 (a.k.a. Ctrl-H, BS or Backspace) lets you back up one character. ASCII character number 13 (a.k.a. Ctrl-M, CR or Carriage Return) returns the cursor to the beginning of the line.
If you are working in C try putchar(8); and putchar(13);
The magic behind colors, cursor positioning, blinking and so on lies in ANSI escape codes. Any text console capable of handling ANSI codes can use them simply by printing them to the console (e.g. with echo in a bash script or the printf() function in C).
Unix terminals support ANSI escape sequences, and the Windows world used to support them back in the old MS-DOS days, but multibyte console support put an end to this. There is more information here. However, there are ways other than printing ANSI sequences available on Windows. Moreover, if you have Cygwin installed on your Windows machine, ANSI codes work just as well as on any Unix terminal.
Many people mention the ncurses library, which is the de facto standard for GUI-like text-based applications. What this library does is hide all the terminal differences (Windows/Unix flavours) so that the same information is presented as identically as possible across all platforms, though in my own experience this is not always the case (e.g. typical text window frames change because the special characters are not available in all character encodings). The downside of ncurses is that it is a complete API, and it is much harder to get started with than simply writing out some ANSI escape sequences for simple things such as changing the font color, clearing the screen or moving the cursor to an arbitrary position.
For the sake of completeness I paste an example of use of ANSI sequence under Linux that changes the prompt to blue and shows the date:
PS1="\[\033[34m\][\$(date +%H%M)][\u#\h:\w]$ "
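The same escape sequences can also be emitted from a program. The following is a minimal Python sketch; the helper names colored and move_to are invented for illustration, and the output is only meaningful on a terminal that interprets ANSI sequences.

```python
CSI = "\033["  # Control Sequence Introducer: ESC followed by "["

def colored(text: str, color_code: int) -> str:
    # SGR sequence: ESC[<code>m sets the color, ESC[0m resets attributes.
    return f"{CSI}{color_code}m{text}{CSI}0m"

def move_to(row: int, col: int) -> str:
    # CUP sequence: ESC[<row>;<col>H, with 1-based coordinates.
    return f"{CSI}{row};{col}H"

# 34 is the same blue foreground code used in the prompt above.
print(colored("blue text", 34))
```

Writing move_to(1, 1) before a status string places it at the top-left corner, which is one way to keep something like an fps counter in a fixed position.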
You can use Ncurses -
ncurses package is a subroutine library for terminal-independent screen-painting and input-event handling which presents a high level screen model to the programmer, hiding differences between terminal types and doing automatic optimization of output to change one screenfull of text into another
Depending on the platform which you are developing on there's probably a more powerful API which you could use, rather than old ASCII control codes.
e.g. If you are working on Win32 you can actually manipulate the console screen buffer directly.
A good place to start might be here
http://msdn.microsoft.com/en-us/library/ms683171(VS.85).aspx
I have been looking for similar functions/API which would allow me to access the console as something other than a stream of text for other platforms. Haven't found anything yet, but then again, I haven't been looking that hard.
Hope it helps.