Tips for Beginner with PyQt5 and UTF8 - qt

I am learning to build GUIs using PyQt5 (on Windows).
I came across a piece of code with explicit translation into UTF8 in PyQt4 (not PyQt5):
a=QApplication(args)
button=QPushButton(a.trUtf8("éàùè"),None)
After a little bit of googling I found that trUtf8 is deprecated and that now Qt5 assumes that your code is already UTF8 formatted (link):
QObject::trUtf8() and QCoreApplication::Encoding are deprecated. Qt assumes that the source code is encoded in UTF-8.
So, I'd like to know what I should do to comply with this statement.
Even if I am not looking for a vim oriented solution, I'd like to add that I am using Vim to develop my code. So, I added the following options to my setup:
set encoding=utf-8
set fileencoding=utf-8
Is it correct, is it enough?
More generally, what do you need to be UTF8 compliant with Qt5?
Thank you!

The tr function has nothing to do with encoding/decoding per se - it is used for marking out which literal strings may require translation in applications which support multiple languages (if you're familiar with gnu gettext, it performs a similar role to the _() function).
Strings which are marked for translation will only get translated if a translator is installed. Otherwise, the strings will be passed through unchanged. The encoding/decoding aspect only comes into play because the C++ functions take in the source strings as const char* (i.e. bytes) and return a QString (i.e. unicode). So if there is no translation available, and the source string contains non-ASCII characters, a default decoding step would need to be performed. In Qt4, this had to be done explicitly with trUtf8 under some circumstances (see the docs for details) - but in Qt5 it can all be handled by the tr function alone.
If you're using Python 3 with PyQt5, you can pretty much forget about all these issues, because literal strings are unicode by default, and UTF-8 is assumed as the source-code encoding. Also, PyQt5 will always return a python unicode object for any Qt function that would normally return a QString.
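To make the point about Python 3 literals concrete, here is a minimal sketch using only the standard library (PyQt5 is deliberately not imported, and the helper name utf8_round_trip is just an illustration). It shows that a literal with accented characters is already a unicode str and survives a UTF-8 round trip, so the same str could be handed straight to a widget constructor such as QPushButton without any trUtf8-style call.

```python
# Python 3 assumes UTF-8 source files, so no coding declaration is needed.

def utf8_round_trip(text):
    """Encode a str to UTF-8 bytes and decode it back (illustrative helper)."""
    encoded = text.encode("utf-8")   # str -> bytes
    return encoded.decode("utf-8")   # bytes -> str


label = "éàùè"

# The literal is a unicode str, not a byte string.
print(type(label).__name__)

# Encoding to UTF-8 and back gives the same text.
print(utf8_round_trip(label) == label)

# With PyQt5 installed, this str could be passed directly, e.g.
# QPushButton(label); no explicit decoding step is involved.
```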
And of course, if you never intend to support multiple languages in your application, you can safely omit all usage of tr, as it would otherwise perform no useful function.

Related

Coldfusion Decrypt with special characters

How to decrypt Coldfusion with standard algorithm and special characters ?
For example:
<cfoutput>#encrypt("123",key, "CFMX_COMPAT", "UU")#</cfoutput>
result: #-_G4
And if I try to decrypt this
<cfoutput>#decrypt("#-_G4",key,"CFMX_COMPAT", "UU")#</cfoutput>
I will get an error.
I know that in this example i should switch # to ##. But what should i do with other special characters in my database ? How to auto escape all special characters for the decrypt function ?
I will get an error. I know that in this example i should switch # to
##. But what should i do with other special characters in my database ?
Nothing. You can store whatever characters you want. You will not get that error when you invoke decrypt() with your query values.
"Invalid CFML construct found.." is a compilation error. It occurs before any CF code is even executed. The only reason it occurs in your example is because the # is embedded within the CF code itself. So when the CF server parses and compiles that code, it sees the un-escaped # sign as the start of some sort of variable and looks for a closing # sign. When it does not find one where it expects, it raises an error and the compilation fails. So unescaped # signs are only an issue when they are contained within the actual CF code (or a string evaluated as CF code).
When you retrieve the encrypted text from the db table, CF does not evaluate the query values as code. It just pulls the strings from the db and hands them off to the decrypt function. So that error cannot occur.
Having said all that, you really should not use CFMX_COMPAT - for anything. It is not really encryption at all, but rather a legacy obfuscation algorithm maintained for backward compatibility only. Instead use a real encryption algorithm like AES, Blowfish, etc. You might also want to use "base64" or "hex" instead of "UU", as the former are more portable. See the encrypt() docs for a list of the supported algorithms.
What are these CFMX_COMPAT IDs being used for? I'd avoid using them since this algorithm only works with ColdFusion and is guessable. If you want safe, short, unguessable hashes for integers that can be used in URLs, Hashids is the best solution.
http://hashids.org/coldfusion/
This library is freely available for JavaScript, Ruby, Python, Java, Scala, PHP, Perl, CoffeeScript, Objective-C, C, C++11, Go, Lua, Elixir, ColdFusion, Groovy and for Node.js & .NET. The ColdFusion CFC version wasn't compatible with ColdFusion 8, so I used the Java version on that server.
It seems that the single # is causing the issue. Just store the encrypted string in a variable and then pass that variable to decrypt(). It will work.
<cfset key = "15TLe44po">
<cfoutput>#encrypt("123",key, "CFMX_COMPAT", "UU")#</cfoutput>
<cfset encryptedText = encrypt("123",key, "CFMX_COMPAT", "UU") />
<cfoutput>#decrypt("#encryptedText#",key,"CFMX_COMPAT", "UU")#</cfoutput>
<cfabort>

How to obfuscate lua code?

I can't find anything on Google for some tool that encrypts/obfuscates my lua files, so I decided to ask here. Maybe some professional knows how to do it? (For free).
I have made a simple game in lua now and I don't want people to see the code, otherwise they can easily cheat. How can I make the whole text inside the .lua file to just random letters and stuff?
I used to program in C# and I had this .NET obfuscator called SmartAssembly which works pretty well. When someone tried to check the code of my applications it would just be a bunch of letters and numbers together with Chinese characters and stuff.
Does anyone know of a program that can do this for lua as well? Just load the file to encrypt, click Encrypt or something, and bam! It works!?
For example this:
print('Hello world!')
would turn into something like
sdf9sd###&/sdfsdd9fd0f0fsf/&
Just precompile your files (chunks) and load binary chunks. luac allows you to strip debugging info. If that is not enough, define your own transformations on the compiled lua, stripping names where possible. There's not really much demand for lua obfuscators though...
Also, you lose one of the main advantages of using an embedded scripting language: Extensibility.
The simplest obfuscation option is to compile your Lua code as others suggested, however it has two major issues: (1) the strings are still likely to be easily visible in your compiled code, and (2) the compiled code for Lua interpreter is not portable, so if you target different architectures, you need to have different compiled chunks for them.
The first issue can be addressed by using a pre-processor that (for example) converts your strings to a sequence of numbers and then concatenates them back at run-time.
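As a rough illustration of such a pre-processor, here is a minimal Python sketch (the function name lua_string_to_char_call is made up for this example) that turns a string into a Lua string.char(...) expression. It only handles one string at a time and does not parse Lua source; finding the literals in your files is left out. Very long strings would probably need to be split into several calls and concatenated with .. at run-time.

```python
def lua_string_to_char_call(text):
    """Return a Lua expression that rebuilds `text` at run-time.

    The string is encoded as UTF-8 and each byte becomes a numeric
    argument to Lua's string.char, so the readable literal no longer
    appears in the source (or in the compiled chunk's constants).
    """
    codes = ", ".join(str(byte) for byte in text.encode("utf-8"))
    return "string.char(" + codes + ")"


# Generate the replacement for the literal in print('Hello world!')
print("print(" + lua_string_to_char_call("Hello world!") + ")")
```

This only hides strings from a casual look; anyone who runs the expression gets the text back.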
The second issue is not easily addressed without changes to the interpreter, but if you have a choice of interpreters, then LuaJIT generates portable bytecode that will run across all its platforms (running the same version of LuaJIT); note that LuaJIT bytecode is different from Lua bytecode, so it can't be run by a Lua interpreter.
A more complex option would be to encrypt the code (possibly before compiling it), but you need to weigh any additional mechanisms (and work on your part) against any possible inconvenience for your users and any loss you have from someone cracking the protection. I'd personally use something sufficiently simple to deter the majority of curious users as you likely stand no chance against a dedicated hacker anyway.
You could use loadstring to get a chunk then string.dump and then apply some transformations like cycling the bytes, swapping segments, etc. Transformations must be reversible. Then save to a file.
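To show what a reversible transformation of that kind can look like, here is a small sketch in Python rather than Lua (the names scramble and unscramble and the default shift of 7 are arbitrary choices for this example). It cycles every byte by a fixed amount and swaps the two halves of the data; the inverse undoes both steps in the opposite order. The same idea would apply to the bytes returned by string.dump.

```python
def scramble(data, shift=7):
    """Cycle each byte by `shift` and swap the two halves of the data."""
    rotated = bytes((byte + shift) % 256 for byte in data)
    half = len(rotated) // 2
    # Move the second segment to the front.
    return rotated[half:] + rotated[:half]


def unscramble(data, shift=7):
    """Inverse of scramble(): swap the segments back, then undo the cycling."""
    # After scrambling, the front segment has this many bytes.
    front = len(data) - len(data) // 2
    restored = data[front:] + data[:front]
    return bytes((byte - shift) % 256 for byte in restored)


chunk = b"\x1bLua example bytes"
hidden = scramble(chunk)
print(hidden != chunk)
print(unscramble(hidden) == chunk)
```

This is obfuscation, not encryption: whoever has the unscramble step can recover the chunk.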
Note that anyone having access to your "encryptor" Lua module will know how to decrypt your file. If you write your encryption module in C/C++, anyone with access to the source will too, and anyone with access to the binary of the Lua encryption module could simply require the module and unobfuscate the code. With an interpreted language this is quite difficult to do: you can raise the bar a bit via the above techniques, but raising it to require a significant amount of work (the only real deterrent) is very difficult AFAIK.
If you embed the Lua interpreter then you can do this from C, which makes it significantly harder (assuming a Release build with all symbols stripped). A person would have to be comfortable with stepping through assembly, but it only takes one capable person to crack the algorithm and then they can make the code available to others.
Yo still interested in doing this? :)
I thought I'd add some example code, since the answers here were helpful, but didn't get us all the way there. We wanted to save some lua table information, and just not make it super easy for someone to inject their own code. Serialize your table, then use load(str) to turn it into a loadable lua chunk, and save it with string.dump. With the 'true' parameter, debug information is stripped, and there's really not much there. Yes, you can see string keys, but it's much better than just saving the naked serialized lua table.
function tftp.SaveToMSI( tbl, msiPath )
    assert(type(tbl) == "table")
    assert(type(msiPath) == "string")
    local localName = _GetFileNameFromPath( msiPath )
    local file,err = io.open(localName, "wb")
    assert(file, err)
    -- convert the table into a string
    local str = serializer.Serialize( tbl )
    -- create a lua chunk from the string. this allows some amount of
    -- obfuscation, because it looks like gobbledygook in a text editor
    local chunk = string.dump(load(str), true)
    file:write(chunk)
    file:close()
    -- send from /usr to the MSI folder
    local sendResult = tftp.SendFile( localName, msiPath )
    -- remove from the /usr folder
    os.remove(localName)
    return sendResult
end
The output from one small table looks like this in Notepad++ :
LuaS У
Vx#w( # АKА└АJБ┴ JА #
& А &  name
Coulombmetervalue?С╘ ажў

Supporting long unicode filepaths with System.Data.SQLite

I'm developing an application that needs to be able to create & manipulate SQLite databases in user-defined paths. I'm running into a problem I don't really understand. I'm testing my stuff against some really gross sample data with huge unwieldy unicode paths, for most of them there isn't a problem, but for one there is.
An example of a working connection string is:
Data Source="c:\test6\意外な高価で売れるかも? 出品は手順を覚えれば後はかんたん!\11オークションストアの出品は対象外とさせていただきます。\test.db";Version=3;
While one that fails is
Data Source="c:\test6\意外な高価で売れるかも? 出品は手順を覚えれば後はかんたん!\22今やPCライフに欠かせないのがセキュリティソフト。そのため、現在何種類も発売されているが、それぞれ似\test.db";Version=3;
I'm using System.Data.SQLite v1.0.66.0 due to reasons outside of my control, but I quickly tested with the latest, v1.0.77.0 and had the same problems.
Both when attempting to newly create the test.db file or if I manually put one there and it's attempting to open, SQLiteConnection.Open throws an exception saying only "Unable to open the database file", with the stack trace showing that it's actually System.Data.SQLite.SQLite3.Open that is throwing.
Is there any way I can get System.Data.SQLite to play nicely with these paths? A workaround could be to create and manipulate my databases in a temporary location and then just move them to the actual locations for storage, since I can create and manipulate files normally otherwise. That's kind of a last resort though.
Thank you.
I am guessing you are on a Japanese-locale machine where the default system encoding (ANSI code page) is cp932 Japanese (≈Shift-JIS).
The second path contains:
ソ
which encodes to the byte sequence:
0x83 0x5C
Shift-JIS is a multibyte encoding that has the unfortunate property of sometimes re-using ASCII code units in the trail byte. In this case it has used byte 0x5C which corresponds to the backslash \. (Though this typically displays as a yen sign in Japanese fonts, for historical reasons.)
So if this pathname is passed into a byte-based API, it will get encoded in the ANSI code page, and you won't be able to tell the difference between a backslash meant as a directory separator and one that is a side-effect of multi-byte encoding. Consequently any path with one of the following characters in will fail when accessed with a byte-based IO method:
―ソЫⅨ噂浬欺圭構蚕十申曾箪貼能表暴予禄兔喀媾彌拿杤歃畚秉綵臀藹觸軆鐔饅鷭偆砡纊犾
(Also any pathname that contains a Unicode character not present in cp932 will naturally fail.)
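The byte-level claim is easy to check. Below is a minimal Python sketch (the helper name chars_with_backslash_trail_byte is made up for this example) that encodes each character of a path with the cp932 codec and reports the ones whose second byte is 0x5C. Characters that cannot be encoded in cp932 at all are skipped here, since they are the separate failure case mentioned above.

```python
from typing import List

BACKSLASH = 0x5C


def chars_with_backslash_trail_byte(path: str) -> List[str]:
    """Return the characters in `path` whose cp932 trail byte is 0x5C."""
    risky = []
    for ch in path:
        try:
            encoded = ch.encode("cp932")
        except UnicodeEncodeError:
            # Not representable in cp932; a different problem, skipped here.
            continue
        # Only two-byte sequences count: a real backslash is a single byte.
        if len(encoded) == 2 and encoded[1] == BACKSLASH:
            risky.append(ch)
    return risky


print("ソ".encode("cp932"))                        # two bytes, the second is 0x5C
print(chars_with_backslash_trail_byte("ソフト"))
print(chars_with_backslash_trail_byte("test.db"))
```

Running it over a directory name before creating the database there would tell you whether the path is likely to hit this problem on a cp932 machine.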
It would appear that behind the scenes SQLite is using a byte-based IO method to open the filename it is given. This is unfortunate, but extremely common in cross-platform code, because the POSIX C standard library is defined to use byte-based filenames for operations like file open().
Consequently using the C stdlib functions it is impossible to reliably access files with non-ASCII names. This sad situation inherits into all sorts of cross-platform libraries and languages written using the stdlib; only tools written with specific support for Win32 Unicode filenames (eg Python) can reliably access all files under Windows.
Your options, then, are:
avoid using non-ASCII characters in the path name for your db, as per the move/rename suggestion;
continue to rely on the system locale being Japanese (ANSI code page=932), and just rename files to avoid any of the characters listed above;
get the short (8.3) filename of the file in question and use that instead of the real one—something like c:\test6\85D0~1\22PC~1\test.db. You can use dir /x to see the short-filenames. They are always pure ASCII, avoiding the encoding problem;
add some code to get the short filename from the real one, using GetShortPathName. This is a Win32 API so you need a little help to call it from .NET. Note also short filenames will still fail if run on a machine with the short filename generation feature disabled;
persuade SQLite to add support for Windows Unicode filenames;
persuade Microsoft to fix this problem once and for all by making the default encoding for byte interfaces UTF-8, like it is on all other modern operating systems.

Is there a required safe way of storing data in QSettings in a cross platform environment?

The mac version of my application has just started breaking its full screen and normal layouts which I save and restore using QSettings. Even old versions of my application are now playing up for my customers.
I was just googling for something similar when I found a bug report which contained an interesting line:
QSettings s;
restoreState(s.value(QString::fromLocal8Bit("state")).toByteArray());
When saving to the computer's plists or the Windows registry, do I have to format the data with this fromLocal8Bit()?
http://bugreports.qt-project.org/browse/QTBUG-8631
http://bugreports.qt-project.org/secure/attachment/13400/main.cpp
It's not the data that is encoded, it's just the literal "state". The values are properly encoded and decoded if you use QByteArray or QString.
The QString::fromLocal8Bit() part is for converting the string literal in the source file to a unicode string. It's good practice to stick to ASCII in source files, as other encodings such as UTF-8 usually don't work with all compilers, especially MSVC.
To convert literals to QString I would suggest using QLatin1String:
QLatin1String("state")
fromLocal8Bit() is, strictly speaking, wrong here, as the local 8-bit encoding has nothing to do with the encoding of the source file that the literal comes from.
But as "state" doesn't contain any non-ascii characters, it shouldn't matter anyway.

How to Decode ASN.1 format to CSV format using Unix Programing

I have ASN.1-format files that I need to convert into a readable CSV format.
I need a decoder with some advanced options, such as scheduling and automatic FTP.
Pretty old thread but it still comes at top on Google search and it will be good for people to get an answer. Not exactly Unix programming but you can find a "generic" Javascript based ASN decoder at http://lapo.it/asn1js/.
You can also download this and run natively on your box.
Erlang provides very good support for reading and writing BER-, DER- and PER-encoded ASN.1 packets. The Erlang compiler even accepts ASN.1 syntax natively and produces a custom codec as an Erlang module.
It's a whole new programming language for most people though, so whether it's worth learning it just for this exercise, I'll leave up to you. It is a very fun language to learn, however, and will teach you a very different way to think about programming.
You could have a look at asn1 compiler.
It converts asn.1 syntax files to C code.
As Marcelo noted on your question, you didn't say precisely what you need, so I can't tell if it covers all your bases, but you will be able to compile the code to a binary (from C code, obviously).
There is an open source package called asn1c which will do ASN.1 encoding and decoding. It's a C library that you need to build and then write code around to implement your program. In order to build the library, it requires the ASN.1 syntax file that is used to construct the encoded messages. When decoding, one option is to output the data to an XML file which you would then need to convert to a CSV file somehow. At a minimum, it supports BER, XER, and PER.
In Python, there is also the PyASN1 library and tools: http://pyasn1.sourceforge.net/
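For a feel of what the decode-to-CSV step involves, here is a minimal sketch using only the Python standard library, not any of the libraries mentioned above. It walks BER/DER tag-length-value elements and writes one CSV row per element. It makes strong assumptions: single-byte tags only, definite lengths only, and no knowledge of your ASN.1 schema, so values come out as raw hex. The function names parse_tlv and to_csv and the sample bytes are invented for this example; real data described by an ASN.1 syntax file (or PER-encoded data) needs one of the proper tools.

```python
import csv
import io


def parse_tlv(data, depth=0):
    """Yield (depth, tag, length, value_hex) for each BER/DER element.

    Assumes single-byte tags and definite lengths (a simplification).
    """
    pos = 0
    while pos < len(data):
        tag = data[pos]
        pos += 1
        length = data[pos]
        pos += 1
        if length & 0x80:
            # Long form: the low 7 bits give the number of length bytes.
            count = length & 0x7F
            length = int.from_bytes(data[pos:pos + count], "big")
            pos += count
        value = data[pos:pos + length]
        pos += length
        constructed = bool(tag & 0x20)
        yield depth, "0x%02X" % tag, length, "" if constructed else value.hex()
        if constructed:
            # Constructed types (SEQUENCE, SET, ...) contain nested elements.
            yield from parse_tlv(value, depth + 1)


def to_csv(data):
    """Return the decoded elements as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["depth", "tag", "length", "value_hex"])
    writer.writerows(parse_tlv(data))
    return out.getvalue()


# Hand-made sample: SEQUENCE { INTEGER 5, UTF8String "hi" }
sample = bytes.fromhex("30070201050c026869")
print(to_csv(sample))
```

Scheduling and FTP transfer would sit outside a script like this, for example in cron.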

Resources