Observing the sequence of I, P and B frames - codec

I am using the x264 encoder. How can I see the sequence of I, P and B frames as they are being encoded? While encoding I used parameters like these:
x264 --crf 23 --tune fastdecode --fps 64 --keyint 1 --min-keyint 1 --no-scenecut --input-res 4096*2048 --bframes 3 -o filename
Is it possible to see these sequences while decoding (using ijkplayer)?

Add the additional parameter "--log-level debug" to the x264 command line, and you will see the encode order and the picture order count (POC) of each frame.
An example of output is here: https://i.stack.imgur.com/GTiMS.png

How to read GDB 13 database? And is there any easy way to clean this data?

5
0001 -417.031
C 1.04168, -0.05620, -0.07148 1.041682, -0.056200, -0.071481
H 2.15109, -0.05620, -0.07150 2.130894, -0.056202, -0.071496
H 0.67187, 0.17923, -1.09059 0.678598, 0.174941, -1.072044
H 0.67188, 0.70866, 0.64196 0.678613, 0.694746, 0.628980
H 0.67188, -1.05649, 0.23421 0.678614, -1.038285, 0.228641
8
0002 -711.117
C 0.99571, 0.01149, -0.09922 0.995914, 0.011511, -0.099221
C 2.51489, 0.01148, -0.09922 2.514686, 0.011466, -0.099226
H 0.61911, 0.74910, -0.83887 0.597259, 0.729877, -0.819596
H 0.61911, 0.28325, 0.90938 0.597259, 0.276170, 0.883106
H 0.61909, -0.99785, -0.36818 0.597278, -0.971531, -0.361167
H 2.89151, 1.02083, 0.16973 2.913322, 0.994509, 0.162719
H 2.89149, -0.26027, -1.10783 2.913341, -0.253192, -1.081553
H 2.89149, -0.72612, 0.64042 2.913341, -0.706900, 0.621148
These two data points are from the chemical database GDB 13. I am trying to understand what these numbers represent. I believe 5 and 8 are atomic numbers, 0001 and 0002 are atom IDs, and -417.031 and -711.117 are atomization energies. However, I don't understand what the numbers below them mean. I am fairly sure they represent the geometry in three-dimensional space, but if so, why are there six numbers per line? How do I read those six numbers?
I am also trying to convert the data to the BOB representation. Is there a way to do that without hard-coding it? I am using R; can R do that?
Have a look at the original paper in Int. J. Quantum Chem., 2015, 115, 1058-1073 (DOI).
The Extended XYZ format is explained in Fig. 7 of the article.
You are right that the first line denotes the number of atoms k, while the second line consists of an identifier and the energy of atomization for the particular molecule.
The next k lines contain two sets of Cartesian coordinates (in Ångström). The left block contains the x, y, z coordinates from a force-field calculation (UFF), while the coordinates on the right stem from a DFT calculation.
A common tool to read and convert coordinate files in various formats is Open Babel. Have a look at the accompanying paper in J. Cheminformatics, 2011, 3:33 (DOI).
There exist various bindings for Open Babel, and apparently there is one for R too. Have a look.
I just ran a quick test on the first entry in the data from the supplement of the paper by Matthias Rupp using Open Babel 2.3.2:
obabel -ixyz c1.xyz -oxyz -O c1a.xyz
Apparently, only the left coordinate block is read in! If you suspect that the coordinates from UFF and DFT calculations differ significantly, you're probably on your own. However, given that the file format is documented, this should not be a major problem.
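If you need both blocks, reading the file by hand takes only a few lines. Here is a minimal Python sketch, assuming the layout described above (atom count, then identifier and energy, then one line per atom with the UFF coordinates on the left and the DFT coordinates on the right). The function name parse_extended_xyz and the dictionary keys are made up for illustration; they are not part of Open Babel or of the paper.

```python
def parse_extended_xyz(text):
    """Parse molecules whose atom lines carry two coordinate triples."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    molecules = []
    i = 0
    while i < len(lines):
        n_atoms = int(lines[i])                    # line 1: number of atoms k
        mol_id, energy = lines[i + 1].split()      # line 2: identifier, energy
        atoms = []
        for line in lines[i + 2 : i + 2 + n_atoms]:
            # Commas separate the coordinates, so turn them into spaces.
            fields = line.replace(",", " ").split()
            values = [float(v) for v in fields[1:7]]
            atoms.append({
                "element": fields[0],
                "uff": tuple(values[:3]),          # left block (force field)
                "dft": tuple(values[3:]),          # right block (DFT)
            })
        molecules.append({"id": mol_id, "energy": float(energy), "atoms": atoms})
        i += 2 + n_atoms
    return molecules


sample = """
5
0001 -417.031
C 1.04168, -0.05620, -0.07148 1.041682, -0.056200, -0.071481
H 2.15109, -0.05620, -0.07150 2.130894, -0.056202, -0.071496
H 0.67187, 0.17923, -1.09059 0.678598, 0.174941, -1.072044
H 0.67188, 0.70866, 0.64196 0.678613, 0.694746, 0.628980
H 0.67188, -1.05649, 0.23421 0.678614, -1.038285, 0.228641
"""

molecules = parse_extended_xyz(sample)
for atom in molecules[0]["atoms"]:
    print(atom["element"], atom["uff"], atom["dft"])
```

From there it is easy to write either block back out as a plain XYZ file that any standard tool can read.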
If you don't mind a remark, the title of your question is somewhat misleading. The data in question is only remotely related to GDB-13. To my knowledge, the GDB files from Jean-Louis Reymond do not contain any coordinates. They are large collections of SMILES strings, from which coordinates would have to be generated for each entry.

How to calculate the maximum of data bits for each QR code?

I have some information for QR version 40 (177×177 modules) with error correction level L (7% error correction):
Version: 40
Error Correction Level: L
Data bits: 23,648
Numeric Mode: 7089
Alphanumeric Mode: 4296
Byte Mode: 2953
I don’t know about these points:
Does 1 module equal 1 bit?
How to calculate the maximum number of data bits for a given QR code version? E.g. why do we have 23,648 data bits?
How to convert data bits to Numeric/Alphanumeric capacity for a given QR code version? E.g. why do we have 7,089 for Numeric and 4,296 for Alphanumeric?
Thanks all!
The numbers you refer to result from several distinct steps performed when generating the symbol, described in detail in ISO/IEC 18004.
Any formula for the data capacity is necessarily awkward and unenlightening, since many of the parameters that determine the structure of QR Code symbols were chosen manually. Implementations must therefore generally resort to tables of constants for these non-computed values.
How to derive the number of usable data bits
Essentially the total number of data modules for a chosen symbol version would be the total symbol area less any function pattern modules and format/version information modules:
DataModules = Rows × Columns − ( FinderModules + AlignmentModules + TimingPatternModules ) − ( FormatInformationModules + VersionInformationModules )
The values of these parameters are constants defined per symbol version.
Some of these data modules are then allocated to error correction purposes as defined by the chosen error correction level. What remains is the usable data capacity of the symbol found by treating each remaining module as a single bit:
UsableDataBits = DataModules − ErrorCorrectionBits
How to derive the character capacity for each mode
Encoding of the input data begins with a 4-bit mode indicator followed by a character count value whose length depends on the version of the symbol and the mode. Then the data is encoded according to the rules for the particular mode resulting in the following data compaction:
Numeric: Groups of 3 characters into 10 bits; 2 remainders into 7 bits; 1 remainder into 4 bits.
Alphanumeric: Groups of 2 characters into 11 bits; 1 remainder into 6 bits.
Byte: Each character into 8 bits.
Kanji: Each wide-character into 13 bits.
Although it does not affect the symbol capacity, for completeness I'll point out that a 4-bit terminator pattern is appended which may be truncated or omitted if there is insufficient capacity in the symbol. Any remaining data bits are then filled with a padding pattern.
Worked Example
Given a version 40 symbol with error correction level L.
The size is 177×177 = 31329 modules
There are three 8×8 finder patterns (192 modules), forty-six 5×5 alignment patterns (1150 modules) and 272 timing modules, totalling 1614 function pattern modules.
There are also 31 format information modules and 36 version information modules, totalling 67 modules.
DataModules = 31329 − 1614 − 67 = 29648
Error correction level L dictates that there shall be 750 8-bit error correction codewords (6000 bits):
UsableDataBits = 29648 − 6000 = 23648
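The arithmetic above can be written down as a small Python sketch. The function name usable_data_bits is invented for this illustration, and the module counts are simply the version 40-L constants quoted in this answer, not values computed from the specification.

```python
def usable_data_bits(size, function_modules, info_modules, ec_codewords):
    """Data modules minus error correction bits, one bit per module."""
    data_modules = size * size - function_modules - info_modules
    return data_modules - 8 * ec_codewords


# Constants for a version 40 symbol, as listed above.
finder = 3 * 8 * 8          # three 8x8 finder patterns
alignment = 46 * 5 * 5      # forty-six 5x5 alignment patterns
timing = 272                # timing pattern modules
info = 31 + 36              # format + version information modules

bits_40_l = usable_data_bits(177, finder + alignment + timing, info, 750)
print(bits_40_l)
```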
The character count lengths for a version 40 symbol are specified as follows:
Numeric: 14 bits.
Alphanumeric: 13 bits.
Byte: 16 bits.
Kanji: 12 bits.
Consider alphanumeric encoding. From the derived UsableDataBits figure of 23648 bits available we take 4 bits for the mode indicator and 13 bits for the character count, leaving just 23631 for the actual alphanumeric data (and the truncatable terminator and padding).
You quoted 4296 as the alphanumeric capacity of a version 40-L QR Code symbol. Now 4296 alphanumeric characters become exactly 2148 groups of two characters, each converted to 11 bits, producing 23628 data bits, which is just inside our symbol capacity. However, 4297 characters would produce 2148 groups with one remainder character that would be encoded into 6 bits, which produces 23628 + 6 = 23634 bits overall, exceeding the 23631 bits available. So 4296 characters is clearly the correct alphanumeric capacity of a version 40-L QR Code.
Similarly for numeric encoding we have 23648−4−14 = 23630 bits available. Your quoted 7089 is exactly 2363 groups of three characters each converted to 10 bits, producing 23630 bits – exactly filling the bits available. Clearly any further characters would not fit so we have found our limit.
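The same procedure works for every mode, so it can be sketched in a few lines of Python. This is only an illustration of the steps above: the names bits_needed and capacity are made up, and the table holds the version 40 character count lengths listed earlier.

```python
MODE_INDICATOR_BITS = 4
# Character count field lengths for a version 40 symbol (from the list above).
COUNT_BITS_V40 = {"numeric": 14, "alphanumeric": 13, "byte": 16, "kanji": 12}


def bits_needed(mode, n_chars):
    """Bits used by n_chars characters under the compaction rules above."""
    if mode == "numeric":
        groups, rest = divmod(n_chars, 3)
        return groups * 10 + (0, 4, 7)[rest]   # 1 left over: 4 bits, 2: 7 bits
    if mode == "alphanumeric":
        groups, rest = divmod(n_chars, 2)
        return groups * 11 + rest * 6
    if mode == "byte":
        return n_chars * 8
    if mode == "kanji":
        return n_chars * 13
    raise ValueError(f"unknown mode: {mode}")


def capacity(mode, usable_bits, count_bits=COUNT_BITS_V40):
    """Largest character count that still fits into usable_bits."""
    available = usable_bits - MODE_INDICATOR_BITS - count_bits[mode]
    n = 0
    while bits_needed(mode, n + 1) <= available:
        n += 1
    return n


for mode in ("numeric", "alphanumeric", "byte"):
    print(mode, capacity(mode, 23648))
```

For 23648 usable bits this reproduces the three figures quoted in the question.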
Caveat
Whilst the character capacity can be derived using the above procedure, in practice QR Code permits encoding the input using multiple modes within a single symbol, and a decent QR Code generator will switch between modes as often as necessary to optimise the overall data density. This makes the whole business of considering the capacity limits much less useful for open applications, since they only describe the pathological case.

On the nature of Information and Entropy definitions

I was looking at Shannon's definitions of intrinsic information and entropy (of a "message").
Honestly, I fail to intuitively grasp why Shannon defined those two in terms of the logarithm (apart from the desirable "split multiplication into sum" property of logarithms).
Can anyone help shed some light on this?
Thanks.
I believe that Shannon was working at Bell Labs when he developed the idea of Shannon entropy: the goal of his research was to find the best way to encode information with bits (0 and 1).
This is the reason for the log2: it has to do with the binary encoding of a message. If numbers that can take 8 different values are transmitted over a telecommunication line, signals of 3 bits (log2(8) = 3) are needed to transmit them.
Shannon entropy is the minimum average number of bits needed to encode each character of a message (for any message written in any alphabet).
Let us take an example. We have the following message to encode with bits:
"0112003333".
The characters of the message are in {0,1,2,3}, so we would need at most log2(4) = 2 bits to encode the characters of this message. For example, we could use the following way to encode the characters:
0 would be coded by 00
1 would be coded by 01
2 would be coded by 10
3 would be coded by 11
The message would then be encoded like this: "00010110000011111111"
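However, we could do better by giving shorter codes to the most frequent characters, as long as no code is the beginning of another one (otherwise the message could not be decoded unambiguously):
3 would be coded by 0
0 would be coded by 10
1 would be coded by 110
2 would be coded by 111
The message would then be encoded like this: "1011011011110100000"
That is 19 bits instead of 20, or 1.9 bits per character. No such code can use fewer bits per character on average than the entropy of "0112003333", which is between 1 and 2 (it is 1.85, to be more precise).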
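To check the 1.85 figure, the entropy can be computed directly from the character frequencies with H = -Σ p·log2(p). Here is a minimal Python sketch (the function name shannon_entropy is just for illustration):

```python
from collections import Counter
from math import log2


def shannon_entropy(message):
    """Entropy in bits per character, from the character frequencies."""
    counts = Counter(message)
    total = len(message)
    # Each character contributes p * log2(1/p), where p is its frequency.
    return -sum((c / total) * log2(c / total) for c in counts.values())


h = shannon_entropy("0112003333")
print(round(h, 2))
```

The term log2(1/p) is the number of bits "deserved" by a character of probability p: a character seen half the time gets 1 bit, one seen a quarter of the time gets 2 bits, and so on.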

Addition in hexadecimal

I may have formulated the question a bit wrongly. I need to calculate the IPv4 header checksum in hexadecimal with pen and paper. They do it in the last example at this link: http://en.wikipedia.org/wiki/IPv4_header_checksum
I have a bit of a problem understanding how they add directly in hexadecimal. When doing it on paper, what if I get a number over 15, for example 48? What remainder do I carry, and what do I write down?
Can anyone explain how to handle this?
Thank you, and sorry for formulating the question wrongly, but I have changed it now :)
See http://www.youtube.com/watch?v=UGK8VyV1gLE which describes the process very well.
Counting in hex (base 16) is just like counting in decimal (base 10), except that you only start carrying when you count past F.
So in your example from a comment, it's just like adding in decimal with no carry:
15
24
---
39
A simple addition that really needs hex arithmetic is:
11
F
---
20
1 + F = 10, so write 0 and carry 1; then 1 + 1 (the carry) = 2, giving 20
15 plus 48 is simple too:
15
48
---
5D
8 + 5 = D, no carry; 1 + 4 = 5, no carry
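The paper method can be sketched in Python. This is only an illustration (add_hex is a made-up helper, not a library function): it adds the digits column by column from the right, writes down the column sum modulo 16 and carries the column sum divided by 16. So a column sum of 48 means writing 0 and carrying 3, since 48 = 3 × 16 + 0.

```python
HEX_DIGITS = "0123456789ABCDEF"


def add_hex(*numbers):
    """Add hexadecimal strings column by column, as done on paper."""
    width = max(len(n) for n in numbers)
    padded = [n.upper().rjust(width, "0") for n in numbers]
    digits = []
    carry = 0
    for col in range(width - 1, -1, -1):           # rightmost column first
        column_sum = carry + sum(HEX_DIGITS.index(n[col]) for n in padded)
        carry, digit = divmod(column_sum, 16)      # write digit, carry the rest
        digits.append(HEX_DIGITS[digit])
    while carry:                                   # leftover carry becomes new digits
        carry, digit = divmod(carry, 16)
        digits.append(HEX_DIGITS[digit])
    return "".join(reversed(digits))


print(add_hex("11", "F"))
print(add_hex("15", "48"))
print(add_hex("F", "F", "F", "3"))   # single column summing to 48
```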
Hexadecimal is just a representation of numbers. To have the computer help you with the addition, you have to convert the hexadecimal strings to numbers, do the addition, and then convert the result back. This is not a conversion to binary, as binary is also just a different representation.
If you do not want the conversion from hexadecimal, you will have to explain why not.
I suppose this may sound like a dumb answer, but it's the best I can give with the way you wrote the question.
Addition in hex works exactly the same as in decimal, except with 16 digits instead of 10. So in effect, what you're asking is how to do addition in general (including in decimal). In decimal, 9 + 1 = 10. In hex, F + 1 = 10. The same rules of addition apply in both.

Y = base64(X) where X is integer - is Y alphanumeric?

Additional details:
X is any positive integer 6 digits or less.
X is left-padded with zeros to maintain a width of 6.
Please explain your answer :)
(This might be better in the Math site, but figured it involves programming functions)
The picture from the German Wikipedia article is very helpful:
You see that 6 consecutive bits from the original bytes generate a Base64 value. To generate + or / (codes 62 and 63), you'd need the bitstrings 111110 and 111111, so at least 5 consecutive bits set.
However, look at the ASCII codes for 0...9:
00110000
00110001
00110010
00110011
00110100
00110101
00110110
00110111
00111000
00111001
No matter how you concatenate six of those, there won't be more than 3 consecutive bits set. So it's not possible to generate a Base64 string that contains + or / this way; Y will always be alphanumeric.
EDIT: In fact, you can even rule other Base64 values out like 000010 (C), so this leads to nice follow-up questions/puzzles like "How many of the 64 values are possible at all?".
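This can also be checked by brute force. Here is a minimal Python sketch, assuming X is encoded as ASCII digits: since Base64 encodes each group of 3 bytes independently, a 6-digit string is just two 3-digit groups, so trying all 1000 three-digit groups covers every case. The variable names are only for illustration.

```python
import base64
from itertools import product

# Collect every Base64 symbol that can appear when encoding ASCII digits.
symbols = set()
for triple in product("0123456789", repeat=3):
    group = "".join(triple).encode("ascii")
    symbols.update(base64.b64encode(group).decode("ascii"))

all_alnum = all(s.isalnum() for s in symbols)
print(all_alnum)
print(base64.b64encode(b"000123").decode("ascii"))
print(sorted(symbols))   # the Base64 symbols that actually occur
```

The length of that last list answers the follow-up puzzle about how many of the 64 values are possible at all.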
