I'm looking into generating .scc captions, a format in which caption text is represented in hex. The link shared above gives an example where the string "(horn honking)" is represented as
a820 68ef f26e 2068 ef6e 6be9 6e67 2029
However, when I run this through any of the available hex-to-ASCII converters, I don't get the exact string back; instead I get: ¨ hïòn hïnkéng )
Does anyone know what conversion is used here?
ttconv can convert SCC files to text-based formats, like TTML, IMSC, SRT, VTT...
pip install ttconv
tt.py convert -i <input .scc file> -o <output .ttml file>
Related
I recently added a fix to the atom-script package to resolve garbled text output when compiling & running Java code in non-English Windows environments. (Issue #1166, PR #2471)
After this fix, in the released script package 3.32.1, javac is invoked with both the -J-Dfile.encoding=UTF-8 and -encoding UTF-8 options. (The diff is here.) I have since realized that it would be better to pass the actual encoding of the currently active editor, which holds the target source code, to the -encoding option. After some research, I learned that the encoding can be obtained with
atom.workspace.getActiveTextEditor().getEncoding()
but after testing on a Japanese edition of Windows, it returns shiftjis, which is not a valid encoding name for the javac command-line option. (It should be MS932, SJIS, or something similar.) I have no idea where to get this type of encoding name without writing a large conversion table for all possible encoding names. Is there a good utility for this purpose?
EDIT:
To demonstrate what I intend to do, I have created a branch on my fork. The diff is here.
Getting the current source editor's encoding with
const fileEncoding = getJavaTextEncodingName(atom.workspace.getActiveTextEditor().getEncoding())
and passing it to javac via
const cmd = `javac -encoding ${fileEncoding} -J-Dfile.encoding=UTF-8 -Xlint -sourcepath '${sourcePath}' -d '${tempFolder}' '${filepath}' && java -Dfile.encoding=UTF-8 -cp '${tempFolder}' ${classPackages}${className}`
The function getJavaTextEncodingName() is the core of the question:
function getJavaTextEncodingName(atomTextEncodingName) {
  switch (atomTextEncodingName) {
    case "shiftjis":
      return "MS932"
  }
  return "UTF-8"
}
Obviously this converts "shiftjis" to "MS932", but it would not be pretty to implement every possible encoding-name conversion here, so I am seeking a better alternative.
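Absent a ready-made utility, one pragmatic fallback is a small normalization table keyed on a canonicalized name, so spelling variants collapse to one entry. A sketch of the idea in Python; the mapping entries are illustrative assumptions, not an exhaustive or authoritative list:

```python
# Hypothetical mapping from editor-style encoding names to names
# accepted by javac's -encoding option. Entries are illustrative.
ATOM_TO_JAVA = {
    "shiftjis": "MS932",
    "eucjp": "EUC-JP",
    "euckr": "EUC-KR",
    "gbk": "GBK",
    "big5": "Big5",
    "windows1251": "windows-1251",
}

def java_encoding_name(atom_name: str) -> str:
    """Normalize an editor encoding name for javac, defaulting to UTF-8."""
    # Canonicalize: lowercase and drop separators, so "Shift_JIS",
    # "shift-jis", and "shiftjis" all hit the same table entry.
    key = atom_name.lower().replace("-", "").replace("_", "")
    return ATOM_TO_JAVA.get(key, "UTF-8")

print(java_encoding_name("shiftjis"))   # MS932
print(java_encoding_name("Shift_JIS"))  # MS932
print(java_encoding_name("utf8"))       # UTF-8
```

The table stays small because only names that differ between the two worlds need entries; everything unknown falls back to UTF-8, matching the current behavior.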
I am writing a Qt 5 application (with Qt Creator) which uses special characters like zodiac signs. This code works perfectly fine on Linux Mint 14:
QString s = QString::fromUtf8("\u2648");
But when I compile it on Windows XP SP3, I get a compiler warning saying that the current codepage is cp1252 and the character \u2648 cannot be converted. When I run the program, this character is displayed as a question mark.
According to my system settings, UTF-8 (codepage 65001) is installed on my Windows machine.
(Note, I have not tried this, and I don't know which compiler you are using, and am completely unfamiliar with QT, so I could be wrong. The following is based on general knowledge about Unicode on Windows.)
On Windows, 8-bit strings are generally assumed to be in the current codepage of the system (also called the "ANSI" codepage). This is never UTF-8. On your system, it's apparently cp1252. So there are actually two things going wrong:
You are specifying a Unicode character, which the compiler tries to convert to the current codepage. On Windows, this results in a compile-time error, because cp1252 has no code point to represent U+2648.
But even assuming the code would compile, it would still not work. You would pass this string, which would be in cp1252, to fromUtf8, which expects a UTF-8 string. As the string is not valid UTF-8, this would likely result in a runtime error.
On your Linux system, both steps work "by accident", because it uses UTF-8 for 8-bit strings.
To get this right, specify the 8-bit string in UTF-8 right away:
QString s = QString::fromUtf8("\xE2\x99\x88");
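To double-check that byte sequence for any code point, a quick Python one-liner can produce the UTF-8 bytes to paste into the literal:

```python
# U+2648 (ARIES) encoded as UTF-8 yields exactly the three bytes
# used in the escaped string literal above.
print("\u2648".encode("utf-8").hex(" "))  # e2 99 88
```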
Here is my advice to get everything working:
There is only one encoding worth using: UTF-8! Use it everywhere if possible. So, in the Qt Creator settings, set the default codepage for sources to UTF-8.
You can convert your source code in Qt Creator: Edit -> choose encoding, and there reload in the new codepage. If that can't be done, use the Linux console application iconv this way:
iconv -f cp1252 -t utf-8 your_source_in_cp1252.cpp > your_source_in_utf8.cpp
I use this code snippet for C strings in my sources: in main.cpp, add #include <QTextCodec> and then do:
// For correct encoding
QTextCodec *codec = QTextCodec::codecForName("UTF-8");
QTextCodec::setCodecForCStrings(codec);
(Note that QTextCodec::setCodecForCStrings() exists only in Qt 4; it was removed in Qt 5, where 8-bit string literals passed to QString are interpreted as UTF-8 by default.)
So I am trying to compare a binary file I produce when compiling with gcc against a sample executable that is provided. I used the diff command like this:
diff asgn2 sample-asgn2
Binary files asgn2 and sample-asgn2 differ
Is there any way to see how they differ, instead of just reporting that they differ?
Do a hex dump of the two binaries using hexdump. Then compare the hex dumps using your favorite diffing tool, like kdiff3, tkdiff, xxdiff, etc.
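If you'd rather stay in one tool, the same dump-then-diff idea can be sketched in a few lines of Python with difflib (the file names and sample data here are hypothetical stand-ins for the two binaries):

```python
import difflib

def hexdump(data: bytes, width: int = 16) -> list:
    """Render bytes as 'offset: hex bytes' lines, one row per 16 bytes."""
    return [
        f"{off:08x}: {data[off:off + width].hex(' ')}"
        for off in range(0, len(data), width)
    ]

# Stand-ins for the contents of the two executables.
a = bytes(range(32))
b = bytearray(a)
b[5] = 0xFF  # introduce a one-byte difference

# Diff the textual dumps; only the changed 16-byte row shows up.
for line in difflib.unified_diff(
        hexdump(a), hexdump(bytes(b)), "asgn2", "sample-asgn2", lineterm=""):
    print(line)
```

Because each line covers a fixed-width row of bytes, the diff pinpoints which offsets changed rather than just saying the files differ.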
Why don't you try Vbindiff? It probably does what you want:
Visual Binary Diff (VBinDiff) displays files in hexadecimal and ASCII (or EBCDIC). It can also display two files at once, and highlight the differences between them. Unlike diff, it works well with large files (up to 4 GB).
Where to get Vbindiff depends on which operating system you are using. If Ubuntu or another Debian derivative, apt-get install vbindiff.
I'm using Linux. In my case, the -q option gives just the output you got:
diff -q file1 file2
Without the -q option it will show which lines differ and display them.
You can check man diff to find the right options for your Unix.
vbindiff only does byte-to-byte comparison. If there is just a one-byte addition or deletion, it will mark all subsequent bytes as changed...
Another approach is to transform the binary files into text files so they can be compared with the text diff algorithm.
colorbindiff.pl is a simple, open-source Perl script which uses this method and shows a colored side-by-side comparison, like a text diff. It highlights byte changes, additions, and deletions. It's available on GitHub.
Must every executable have an ELF header?
Also, I would like to know why libraries' and headers' properties are often associated with hex values; what is this hex related to? Why hex and not just binary code or something else?
I'm referring to the hex values that come up with the use of ldd and readelf, for example, two utilities often used under Linux.
This question is about a generic OS rather than a specific one; assume the architecture is x86 or ARM.
Every executable must have an ELF header
Yes, every ELF file begins with an ELF file header. If it doesn't, it's not a valid ELF file by definition.
Why HEX and not just binary code or something else
You appear to be very confused about what hex means. Any integer can be written in many different representations. Decimal (base-10), octal (base-8), and hex (base-16) are the most common ones, but base-20 is not unheard of. It's just a number, regardless of how you choose to represent it.
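To make that concrete, here is the same value printed in several bases; tools like readelf just default to hex because addresses and flag bits align neatly on 4-bit boundaries (the address below is only an example):

```python
addr = 0x400000  # a typical x86-64 ELF load address, written in hex

print(addr)       # 4194304   (decimal)
print(oct(addr))  # 0o20000000 (octal)
print(hex(addr))  # 0x400000  (hexadecimal)

# One value, several spellings.
assert addr == 4194304 == 0o20000000
```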
I want to know how to use the convert command to convert an EPS image to JPG.
I don't have a Linux machine, but I am using Cygwin. I have searched, but nothing works; I always get an invalid-argument error:
$ convert "/cygdrive/e/pdf/B313.eps" "/cygdrive/e/macro/B313.JPG"
Invalid Parameter - /e
It would be great if someone could help solve this problem.
I cannot test it on Cygwin, but the following works under Unix:
convert -density 50 -antialias -colors 128 -background white -normalize -units PixelsPerInch -quality 100 /path/to/eps/test.eps test.jpg
Maybe you have to use backslashes for the file path under Windows:
$ convert "\cygdrive\e\pdf\B313.eps" "\cygdrive\e\macro\B313.JPG"
And try it with absolute paths like c:\path\ ...
Keep in mind that there's another convert command on Windows, which converts between file systems. Maybe the wrong one is being invoked.