Understanding hex output of a raw 8-bit 8000 Hz PCM sine wave

Using Audacity, I generated a 1 Hz sine wave with a 1 second length and 1.0 amplitude, which produced the expected waveform.
With the Audacity sample rate set to 8000 Hz, I then exported the audio as RAW (header-less) signed 8-bit PCM, which resulted in an 8000-byte file (each byte is an 8-bit number between -128 and +127).
Opening the .raw file in HxD, setting 'Bytes per row' to 1, and setting the offset to decimal shows 8000 lines, each line showing one byte in hex.
I can see that there are ten 0's, then ten 1's, then ten 2's, and so on, but once it gets to 16 there are eleven 16's, followed by ten 17's and ten 18's. My question is: why are there 10 of some numbers and 11 of others?

This is just the shape of the sine wave. Near the zero crossing the slope is about 127 × 2π/8000 ≈ 0.1 counts per sample, so each value lasts roughly 10 samples; because that figure isn't exactly 10, rounding occasionally produces a run of 11. As you get closer to the maximum the curve is flatter, so even more consecutive samples round to the same value.
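A quick way to see this is to regenerate the samples and count the runs. A minimal Python sketch, assuming the export maps amplitude 1.0 to a peak of 127 and rounds to the nearest value (Audacity's actual export may scale or dither slightly differently):

import math

# 1 Hz sine sampled at 8000 Hz, scaled to signed 8-bit
samples = [round(127 * math.sin(2 * math.pi * n / 8000)) for n in range(8000)]

# count the length of each run of identical values in the first quarter wave
runs = []
for v in samples[:2000]:
    if runs and runs[-1][0] == v:
        runs[-1][1] += 1
    else:
        runs.append([v, 1])
print(runs[:20])   # first run is half-length; then mostly [value, 10], occasionally [value, 11]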

The left column can't be hex; it must be the sample time offset. The right column is the measured value. What are the values in the right column when they're greater than 9?

Related

When dealing with hexadecimal numbers how do I use bit offset and length to get a value?

I have the following number
0000C1FF61A40000
The offset or start is 36 or 0x23
The length of the number is 12 or 0xc
Can someone help me understand how to get the resulting value? I thought the offset meant what pair of hex numbers to start with and then length would be how many to grab. There definitely aren't 36 pairs, only 8. Not sure how I'd do a length of 12 with only 8.
Each hex digit represents four binary bits. Therefore your offset of 36 bits (which BTW is 0x24, not 0x23) is equivalent to 9 hex digits. So discard the rightmost 9 digits from your original number, leaving you with 0000C1F.
Then the length of the number you want is 12 bits, which is 3 hex digits. So discard all but the rightmost 3 digits, leaving you with C1F as the answer.
If the numbers of bits had not been nice multiples of 4 then you would have had to convert the original hex number into binary, then discard offset number of bits from the right, retain only the rightmost length bits from the result, and finally convert those length bits back into hex.
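In code the same extraction is just a shift and a mask. A minimal Python sketch:

value = 0x0000C1FF61A40000
offset = 36   # bits to discard from the right
length = 12   # bits to keep

field = (value >> offset) & ((1 << length) - 1)
print(hex(field))   # 0xc1f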

Representing decimal numbers in binary

How do I represent integers, for example 23647, in two bytes, where one byte contains the last two digits (47) and the other contains the rest of the digits (236)?
There are several ways to do this.
One way is to use Binary Coded Decimal (BCD). This encodes each decimal digit into binary, rather than the number as a whole. The packed form puts two decimal digits into a byte. However, your example value 23647 has five decimal digits and will not fit into two bytes in BCD; this method only fits values up to 9999.
Another way is to put each of your two parts in binary and place each part into a byte. You can do integer division by 100 to get the upper part, so in Python you could use
upperbyte = 23647 // 100
Then the lower part can be gotten by the modulus operation:
lowerbyte = 23647 % 100
Python will directly convert the results into binary and store them that way. You can do all this in one step in Python and many other languages:
upperbyte, lowerbyte = divmod(23647, 100)
You are guaranteed that the lowerbyte value fits, but if the given value is too large the upperbyte value may not actually fit into a byte. All this assumes that the value is positive, since negative values would complicate things.
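Putting it together, a short sketch that packs the value into two bytes and recovers it (this works for values up to 25599, where the upper part still fits in a byte):

upperbyte, lowerbyte = divmod(23647, 100)   # (236, 47)
packed = bytes([upperbyte, lowerbyte])

# recover the original value from the two bytes
restored = packed[0] * 100 + packed[1]
print(restored)   # 23647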
(This following answer was for a previous version of the question, which was to fit a floating-point number like 36.47 into two bytes, one byte for the integer part and another byte for the fractional part.)
One way to do that is to "shift" the number so you consider those two bytes to be a single integer.
Take your value (36.47), multiply it by 256 (the number of values that fit into one byte), round it to the nearest integer, convert that to binary. The bottom 8 bits of that value are the "decimal numbers" and the next 8 bits are the "integer value." If there are any other bits still remaining, your number was too large and there is an overflow condition.
This assumes you want to handle only non-negative values. Handling negatives complicates things somewhat. The final result is only an approximation to your starting value, but that is the best you can do.
Doing those calculations on 36.47 gives the binary integer
10010001111000
So the "decimal byte" is 01111000 and the "integer byte" is 100100 or 00100100 when filled out to 8 bits. This represents the float number 36.46875 exactly and your desired value 36.47 approximately.

How to calculate the maximum number of data bits for each QR code?

I have some information for QR version 40 (177×177 modules) with error correction level L (7% error correction):
Version: 40
Error Correction Level: L
Data bits: 23,648
Numeric Mode: 7089
Alphanumeric Mode: 4296
Byte Mode: 2953
I don’t understand these points:
Does 1 module equal 1 bit?
How do I calculate the maximum number of data bits in a QR code type? e.g. why do we have 23,648 data bits?
How do I convert data bits to Numeric/Alphanumeric capacity in a QR code type? e.g. why do we have 7,089 for Numeric and 4,296 for Alphanumeric?
Thanks all!
The derivation of the numbers to which you refer is a result of several distinct steps performed when generating the symbol described in detail by ISO/IEC 18004.
Any formula for the data capacity will be necessarily awkward and unenlightening since many of the parameters that determine the structure of QR Code symbols have been manually chosen and therefore implementations must generally resort to including tables of constants for these non-computed values.
How to derive the number of usable data bits
Essentially the total number of data modules for a chosen symbol version would be the total symbol area less any function pattern modules and format/version information modules:
DataModules = Rows × Columns − ( FinderModules + AlignmentModules + TimingPatternModules ) − ( FormatInformationModules + VersionInformationModules )
The values of these parameters are constants defined per symbol version.
Some of these data modules are then allocated to error correction purposes as defined by the chosen error correction level. What remains is the usable data capacity of the symbol found by treating each remaining module as a single bit:
UsableDataBits = DataModules − ErrorCorrectionBits
How to derive the character capacity for each mode
Encoding of the input data begins with a 4-bit mode indicator followed by a character count value whose length depends on the version of the symbol and the mode. Then the data is encoded according to the rules for the particular mode resulting in the following data compaction:
Numeric Groups of 3 characters into 10 bits; 2 remainders into 7 bits; 1 remainder into 4 bits.
Alphanumeric Groups of 2 characters into 11 bits; 1 remainder into 6 bits.
Byte Each character into 8 bits.
Kanji Each wide-character into 13 bits.
Although it does not affect the symbol capacity, for completeness I'll point out that a 4-bit terminator pattern is appended which may be truncated or omitted if there is insufficient capacity in the symbol. Any remaining data bits are then filled with a padding pattern.
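Those compaction rules translate directly into a bit count per mode. A sketch covering only the character data itself (the 4-bit mode indicator and the version-dependent character count field are excluded):

def data_bits(char_count, mode):
    # bits needed for the encoded character data alone
    if mode == 'numeric':
        groups, rem = divmod(char_count, 3)
        return groups * 10 + {0: 0, 1: 4, 2: 7}[rem]
    if mode == 'alphanumeric':
        groups, rem = divmod(char_count, 2)
        return groups * 11 + rem * 6
    if mode == 'byte':
        return char_count * 8
    if mode == 'kanji':
        return char_count * 13

print(data_bits(7089, 'numeric'))       # 23630
print(data_bits(4296, 'alphanumeric'))  # 23628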
Worked Example
Given a version 40 symbol with error correction level L.
The size is 177×177 = 31329 modules
There are three 8×8 finder patterns (192 modules), forty six 5×5 alignment patterns (1150 modules) and 272 timing modules, totalling 1614 function pattern modules.
There are also 31 format information modules and 36 version information modules, totalling 67 modules.
DataModules = 31329 − 1614 − 67 = 29648
Error correction level L dictates that there shall be 750 8-bit error correction codewords (6000 bits):
UsableDataBits = 29648 − 6000 = 23648
The character count lengths for a version 40 symbol are specified as follows:
Numeric 14 bits.
Alphanumeric 13 bits.
Byte 16 bits.
Kanji 12 bits.
Consider alphanumeric encoding. From the derived UsableDataBits figure of 23648 bits available, we take 4 bits for the mode indicator and 13 bits for the character count, leaving just 23631 bits for the actual alphanumeric data (and the truncatable terminator and padding).
You quoted 4296 as the alphanumeric capacity of a version 40-L QR Code symbol. Now 4296 alphanumeric characters becomes exactly 2148 groups of two characters each converted to 11 bits, producing 23628 data bits which is just inside our symbol capacity. However 4297 characters would produce 2148 groups with one remainder character that would be encoded into 6 bits, which produces 23628 + 6 bits overall – exceeding the 23631 bits available. So 4296 characters is clearly the correct alphanumeric capacity of a type 40-L QR Code.
Similarly for numeric encoding we have 23648−4−14 = 23630 bits available. Your quoted 7089 is exactly 2363 groups of three characters each converted to 10 bits, producing 23630 bits – exactly filling the bits available. Clearly any further characters would not fit so we have found our limit.
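The whole worked example can be checked in a few lines, using the version 40-L constants quoted above:

modules = 177 * 177                   # 31329
function_modules = 192 + 1150 + 272   # finder + alignment + timing = 1614
format_and_version = 31 + 36          # 67
ecc_bits = 750 * 8                    # 6000

usable = modules - function_modules - format_and_version - ecc_bits
print(usable)                          # 23648
print((usable - 4 - 14) // 10 * 3)     # 7089 numeric characters
print((usable - 4 - 13) // 11 * 2)     # 4296 alphanumeric characters
# (after 2148 alphanumeric pairs only 3 bits remain, too few for a
# 6-bit remainder character, so the capacity stays at 4296)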
Caveat
Whilst the character capacity can be derived using the above procedure, in practice QR Code permits encoding the input using multiple modes within a single symbol, and a decent QR Code generator will switch between modes as often as necessary to optimise the overall data density. This makes the whole business of considering capacity limits much less useful for open applications, since they only describe the pathological case.

What's the significance of the bit group size in base64 or base32 encoding (RFC 4648)?

Why would they choose to use a 24-bit or 40-bit (that's really odd) bit group/word size for base 64 and base 32 respectively?
Specifically, can someone explain why the least common multiple is significant?
lcm(log2(64), 8) = 24
lcm(log2(32), 8) = 40
Base 64 encoding basically involves taking a stream of 8-bit bytes and transforming it to a stream of 6-bit characters that can be represented by printable ASCII characters.
Taking a single byte at a time means you have one 6-bit character with 2 bits left over.
Taking two bytes (16 bits) means you have two 6-bit characters with 4 bits left over.
Taking three bytes (24 bits) means you have 24 bits that split exactly into four 6-bit characters with no bits left over.
So the lcm of bytes size and character size is naturally the size you need to split your input into.
6-bit characters are chosen because that is the largest size for which every value can be a printable ASCII character; if you went up to 7 bits you would need non-printing characters.
The argument for base 32 is similar, but now you are using 5-bit characters, so the lcm of 8 and 5 is the word size. This character size allows for case-insensitive printable characters; 6-bit characters require differentiating between upper and lower case.
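You can see the 24-bit grouping directly in how the encoded length grows. A quick check with Python's standard base64 module:

import base64

# output grows in 4-character steps, one step per 3 input bytes;
# '=' padding marks a partial 24-bit group
for n in range(1, 7):
    print(n, base64.b64encode(b'\x00' * n))
# 1 b'AA=='  2 b'AAA='  3 b'AAAA'  4 b'AAAAAA==' ...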

How to determine the frequencies of FFT result indexes and plot the amplitude/frequency graph

I have a bit of a hypothetical question to understand this concept.
Let's say I captured a mono voice clip with an 8000 Hz sample rate that is 4096 bytes of data.
Feeding the first 512 bytes (16-bit encoding) through an FFT of size 256 will return 128 values, which I convert to amplitude.
So my frequencies for this output are
FFT BIN #1
0: 0*8000/256
1: 1*8000/256
.
.
127: 127*8000/256
So far so good, eh? Now I have 3584 bytes of unprocessed data left, so I perform another FFT of size 256 on the next 512 bytes of data and get the same number of results.
So for this, do I again have frequencies of:
FFT BIN #2:
Example1:
0: 0*8000/256
1: 1*8000/256
.
.
127: 127*8000/256
or
FFT BIN #2
Example2:
128: 128*8000/256
129: 129*8000/256
.
.
255: 255*8000/256
Because I would like to plot this amplitude/frequency graph, but I don't understand whether all these FFT bins should be overlapped on the same frequencies as in example 1, or spread out as in the second example.
Or am I trying to do something that is completely redundant? What I want to accomplish is to find the peak amplitude value of every 30-50 ms time frame to use for comparison with other sound files.
If anyone can clear this up for me, I'd be very grateful.
Your FFT result bins represent the same set of frequencies in every FFT, as in your example #1, but for different slices of time.
Each FFT will allow you to plot magnitude vs. frequency for about a 32 ms window of time (256 samples at 8000 Hz).
You could also vector sum the FFT magnitudes together to get a Welch method PSD (power spectral density) for a longer time frame.
If you want to find the peak amplitude value of every 30-50 ms time frame, you just need to plot the amplitude spectra of the signal in each of the time frames.
Also, if you take FFT of 256 samples for each frame, then you should get 129, not 128, frequency components. The first one is the DC component, and the last one is the Nyquist frequency component.
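A sketch of the per-frame approach with NumPy, using a synthesised 440 Hz tone as a stand-in for the captured clip (the tone and frame layout are illustrative assumptions):

import numpy as np

fs = 8000
frame = 256   # 256 samples = 512 bytes of 16-bit data = 32 ms at 8000 Hz

# stand-in signal: 2048 samples (the 4096-byte clip) of a 440 Hz tone
t = np.arange(2048) / fs
signal = np.sin(2 * np.pi * 440 * t)

# the bin frequencies are the same for every frame, as in example 1
freqs = np.fft.rfftfreq(frame, d=1/fs)   # 129 values: 0, 8000/256, ..., 4000
for i in range(len(signal) // frame):
    spectrum = np.abs(np.fft.rfft(signal[i*frame:(i+1)*frame]))
    print(f"frame {i}: peak near {freqs[np.argmax(spectrum)]:.0f} Hz")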
