Read n bits (not bytes) in Qt?

I'm a complete newbie in Qt. I have to read a binary file which contains a header, but at some positions I have to read 15-bit and 17-bit integers. Are there any functions to read n bits from a file in Qt?
PS: I read this file in Matlab using fread with the ubitn precision.

You can only read, and generally handle, values in units of at least 8 bits, i.e. a char or 1 byte.
You can, however, perform operations on individual bits within a byte.
In your situation, you should read a sufficient number of bytes and then reinterpret them, for instance through a structure.
Bit field:
struct MyStruct {
    unsigned int widthValidated  : 15;  // first 15 bits
    unsigned int heightValidated : 17;  // next 17 bits
};
MyStruct *ptr = (MyStruct *) &rawWord;  // rawWord holds the 32 bits read from the file
You can also use bit shifting:
uint32_t myValue;                        // the 32 bits read from the file
uint32_t a = myValue & 0x7FFF;           // first 15 bits
uint32_t b = (myValue >> 15) & 0x1FFFF;  // the 17 bits after the first 15

Qt has a C++ interface, and in C++ binary file access is octet-based: you can only read whole 8-bit bytes at a time. So you'll have to read the surrounding bytes and assemble the 15- and 17-bit integers in higher-level logic.
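A minimal sketch of that higher-level logic using QDataStream, assuming the 15-bit and 17-bit fields are packed LSB-first into 4 little-endian bytes (the packing must match however the file was written, and the function name is just illustrative):
#include <QFile>
#include <QDataStream>
#include <QString>

// Read one 32-bit word and unpack a 15-bit and a 17-bit field from it.
bool readHeaderFields(const QString &path, quint32 &field15, quint32 &field17)
{
    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
        return false;

    QDataStream in(&file);
    in.setByteOrder(QDataStream::LittleEndian);  // assumption: little-endian file

    quint32 raw = 0;
    in >> raw;                          // the two fields packed into 32 bits

    field15 = raw & 0x7FFF;             // low 15 bits
    field17 = (raw >> 15) & 0x1FFFF;    // next 17 bits
    return in.status() == QDataStream::Ok;
}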

Related

Understanding output of pointer size in C Programming Language

I am trying to understand why this printf statement gives two different outputs; I think I have a decent understanding of one of the outputs.
Here is the code:
const char *ptr = "hello";
const char array[] = "hello";
//Question 2
printf("%zu %zu\n", sizeof(ptr),sizeof(array));
Now I understand why sizeof(array) returns six: the length of "hello" is 5, plus an additional null terminator.
But I do not understand why sizeof(ptr) is 8; my guess is that all memory addresses in C occupy 8 bytes of memory, hence the size is 8. Is this correct?
The C language, itself, doesn't define what size a pointer is, or even if all pointers are the same size. But, yes, on your platform, the size of a char* pointer is 8 bytes.
This is typical on 64-bit platforms (the name implies 64-bit addressing which, with 8 bits per byte, is 64/8 = 8 bytes). For example, when compiling for a Windows 64-bit target architecture, sizeof(char*) is 8 bytes; however, when targeting the Windows 32-bit platform, sizeof(char*) will be 4 bytes.
Note also that the "hello" string literal/array is 6 bytes including the nul terminator.
sizeof(ptr) returns 8 because it's the size of a pointer on a typical 64-bit architecture. The pointer ptr is a pointer to a constant character, so its size is 8 bytes on a 64-bit system. The value stored in the pointer is the memory address of the first character in the string literal "hello", so ptr takes up 8 bytes of memory, not 6.
The size of a pointer is platform-dependent and can be different on different architectures. On a 32-bit system, the size of a pointer would typically be 4 bytes.
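As a minimal sketch of the same point (the 8 in the comments assumes a typical 64-bit target; a 32-bit build would print 4 for the pointer):
#include <stdio.h>

int main(void)
{
    const char *ptr = "hello";
    const char array[] = "hello";

    printf("%zu\n", sizeof(ptr));    /* size of the pointer itself: 8 here, 4 on a 32-bit target */
    printf("%zu\n", sizeof(array));  /* 6: five characters plus the null terminator */
    printf("%zu\n", sizeof(*ptr));   /* 1: the single char the pointer points at */
    return 0;
}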

PE32 Header - DataDirectory pointer size

I have 2 questions:
The PE32 header consists of several subheaders, one of which is the optional header. MSDN says that the last element of IMAGE_OPTIONAL_HEADER is a pointer to the first IMAGE_DATA_DIRECTORY struct of the executable. When I look into WIN32N.INC for NASM, everything matches what MSDN lists, except that the pointer to the first struct is 8 bytes instead of 4 (the size of a normal 32-bit pointer):
STRUCT IMAGE_OPTIONAL_HEADER
    .Magic          RESW 1
    ...
    .DataDirectory  RESQ 1    ; <----- why RESQ?
ENDSTRUC
When I want to copy the 16 DataDirectories out of the binary data I stored as a "variable" in NASM into a struct: is it OK to create a struct with 32 entries (Export Directory RVA + size, Import Directory RVA + size, etc.) and have the pointer to the first DataDirectory struct in the optional header point to its beginning? Otherwise there would be no way to get from the first element to the others, would there?
Can someone explain this?
DataDirectory is an array of structs, not a pointer.
MSDN tells you that the layout is actually
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
and your .INC file just defines it as a QWORD for some unknown reason.
WinNT.h in the SDK defines the optional header as:
    ...
    DWORD                NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;
where IMAGE_NUMBEROF_DIRECTORY_ENTRIES is 16.
You should not just blindly copy 16 items; the NumberOfRvaAndSizes member specifies how many entries there are, and it can be less than 16.
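For example, a minimal sketch using the WinNT.h definitions that walks the array while honouring NumberOfRvaAndSizes (the function name is just illustrative):
#include <stdio.h>
#include <windows.h>

/* Walk the data directories of a 32-bit optional header. */
void DumpDataDirectories(const IMAGE_OPTIONAL_HEADER32 *opt)
{
    DWORD count = opt->NumberOfRvaAndSizes;
    if (count > IMAGE_NUMBEROF_DIRECTORY_ENTRIES)
        count = IMAGE_NUMBEROF_DIRECTORY_ENTRIES;   /* never trust the header blindly */

    for (DWORD i = 0; i < count; i++) {
        const IMAGE_DATA_DIRECTORY *dir = &opt->DataDirectory[i];
        printf("directory %lu: RVA=0x%08lX size=0x%08lX\n",
               (unsigned long)i,
               (unsigned long)dir->VirtualAddress,
               (unsigned long)dir->Size);
    }
}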

NASM ctypes SIMD - how to access 128-bit array returned to ctypes?

I have a 64-bit DLL written in NASM, called from ctypes. The program multiplies two 64-bit integers and returns a 128-bit integer, so I am using xmm SIMD instructions. It loops 10,000 times and stores its results in a memory buffer created with malloc.
Here is the part of the NASM code where the SIMD calculations are performed:
cvtsi2sd xmm0,rax
mov rax,[pcalc_result_0]
cvtsi2sd xmm1,rax
PMULUDQ xmm0,xmm1
lea rdi,[rel s_ptr] ; Pointer
mov rbp,qword[rdi]
mov rcx,[s_ctr]
;movdqa [rbp + rcx],xmm0
movdqu [rbp + rcx],xmm0
add rcx,16
The movdqa instruction does not work (the program crashes, even though it's assembled with the align=16 directive). The movdqu instruction does work, but when I return the array to ctypes, I need to convert the return pointer to 128-bits, but there is no 128-bit ctypes datatype. Here's the relevant part of the ctypes code:
CallName.argtypes = [ctypes.POINTER(ctypes.c_double)]
CallName.restype = ctypes.POINTER(ctypes.c_int64)
n0 = ctypes.cast(a[0],ctypes.POINTER(ctypes.c_int64))
n0_size = int(a[0+1] / 8)
x0 = n0[:n0_size]
where x0 is the returned array converted to a usable form, but not to 128 bits.
There is a post at Handling 128-bit integers with ctypes that deals with passing 128-bit arrays in but not out.
My questions are:
-- Should I use an instruction other than movdqa or movdqu? Of the many SIMD instructions, these seem the most appropriate.
-- Python can handle integers up to any arbitrary size, but apparently ctypes can't. Is there any way to use 128-bit integers from ctypes when there is no ctypes size larger than 64 bits?
You can use a 16-byte buffer to represent a 128-bit integer and convert to and from that byte format on the Python side. The buffer may not be 16-byte aligned, so you should use movdqu. I would use an input/output parameter instead of a return value, so that Python can manage the memory:
>>> import ctypes
>>> value = 0xaabbccddeeff
>>> int128 = ctypes.create_string_buffer(value.to_bytes(16,'little',signed=True))
>>> int128
<ctypes.c_char_Array_17 object at 0x000001ECCB1D41C8>
>>> int128.raw
b'\xff\xee\xdd\xcc\xbb\xaa\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
(NOTE: The buffer gets null-terminated, which is why it is 17 bytes)
Pass this writable buffer to your function; the function can write the result back into the same buffer. On return, use the following to convert it back to a Python integer:
>>> hex(int.from_bytes(int128.raw[:16],'little',signed=True))
'0xaabbccddeeff'

Is there a minimum string length for F() to be useful?

Is there a limit for short strings below which using the F() macro brings more RAM overhead than it saves?
For (a contrived) example:
Serial.print(F("\n"));
Serial.print(F("Hi"));
Serial.print(F("there!"));
Serial.print(F("How do you doyou how?"));
Would any one of those be more efficient without the F()?
I imagine it uses some RAM to iterate over the string and copy it from PROGMEM to RAM. I guess the question is: how much? Also, is heap fragmentation a concern here?
I'm looking at this purely from SRAM-conserving perspective.
From a purely SRAM-conserving perspective all of your examples are identical in that no SRAM is used. At run-time some RAM is used, but only momentarily on the stack. Keep in mind that calling println() (w/o any parameters) uses some stack/RAM.
For a single character it will take up less space in flash if a char is passed into print or println. For example:
Serial.print('\n');
The char will be in flash (not static RAM).
Using
Serial.print(F("\n"));
will create a string in flash memory that is two bytes long (newline char + null terminator) and will additionally pass a pointer to that string to print which is probably two bytes long.
Additionally at runtime, using the F macro will result in two fetches ('\n' and the null terminator) from flash. While fetches from flash are fast, passing in a char results in zero fetches from flash, which is a tiny bit faster.
I don't think there is any minimum size of the string to be useful. If you look at how the outputting is implemented in Print.cpp:
size_t Print::print(const __FlashStringHelper *ifsh)
{
  PGM_P p = reinterpret_cast<PGM_P>(ifsh);
  size_t n = 0;
  while (1) {
    unsigned char c = pgm_read_byte(p++);
    if (c == 0) break;
    n += write(c);
  }
  return n;
}
You can see from there that only one byte of RAM is used at a time (plus a couple of variables), as it pulls the string from PROGMEM a byte at a time. These are all on the stack so there is no ongoing overhead.
I imagine it uses some RAM to iterate over the string and copy it from PROGMEM to RAM. I guess the question is: how much?
No, it doesn't as I showed above. It outputs a byte at a time. There is no copying (in bulk) of the string into RAM first.
Also, is heap fragmentation a concern here?
No, the code does not use the heap.

How can I perform 64-bit division with a 32-bit divide instruction?

This is (AFAIK) a specific question within this general topic.
Here's the situation:
I have an embedded system (a video game console) based on a 32-bit RISC microcontroller (a variant of NEC's V810). I want to write a fixed-point math library. I read this article, but the accompanying source code is written in 386 assembly, so it's neither directly usable nor easily modifiable.
The V810 has built-in integer multiply/divide, but I want to use the 18.14 format mentioned in the above article. This requires dividing a 64-bit int by a 32-bit int, and the V810 only does (signed or unsigned) 32-bit/32-bit division (which produces a 32-bit quotient and a 32-bit remainder).
So, my question is: how do I simulate a 64-bit/32-bit divide with a 32-bit/32-bit one (to allow for the pre-shifting of the dividend)? Or, to look at the problem another way, what's the best way to divide one 18.14 fixed-point number by another using standard 32-bit arithmetic/logic operations? ("best" meaning fastest, smallest, or both).
Algebra, (V810) assembly, and pseudo-code are all fine. I will be calling the code from C.
Thanks in advance!
EDIT: Somehow I missed this question... However, it will still need some modification to be super-efficient (it has to be faster than the floating-point div provided by the v810, though it may already be...), so feel free to do my work for me in exchange for reputation points ;) (and credit in my library documentation, of course).
GCC has such a routine for many processors, named __divdi3 (usually implemented using a common divmod call). Here's one. Some Unix kernels have an implementation too, e.g. FreeBSD.
If your dividend is an unsigned 64-bit value, your divisor is an unsigned 32-bit value, and the architecture is i386 (x86), then the div instruction can do the work with a little preparation:
#include <stdint.h>

/* Returns *a % b, and sets *a = *a_old / b; */
uint32_t UInt64DivAndGetMod(uint64_t *a, uint32_t b) {
#ifdef __i386__  /* u64 / u32 division with little i386 machine code. */
  uint32_t upper = ((uint32_t*)a)[1], r;
  ((uint32_t*)a)[1] = 0;
  if (upper >= b) {
    ((uint32_t*)a)[1] = upper / b;
    upper %= b;
  }
  __asm__("divl %2" : "=a" (((uint32_t*)a)[0]), "=d" (r) :
          "rm" (b), "0" (((uint32_t*)a)[0]), "1" (upper));
  return r;
#else
  const uint64_t q = *a / b;      /* Calls __udivdi3 in libgcc. */
  const uint32_t r = *a - b * q;  /* `r = *a % b' would use __umoddi3. */
  *a = q;
  return r;
#endif
}
If the line above with __udivdi3 doesn't compile for you, use the __div64_32 function from the Linux kernel: https://github.com/torvalds/linux/blob/master/lib/div64.c
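If neither helper is available for your target, the classic fallback is binary long division done one quotient bit at a time, which needs only 32-bit compares and subtracts on the partial remainder. A rough, portable sketch (the function name and interface are just illustrative, and it is not tuned for the V810):
#include <stdint.h>

/* Divide a 64-bit dividend by a 32-bit divisor: one shift-and-subtract
   step per quotient bit, keeping the partial remainder in 32 bits. */
uint64_t udiv64_32(uint64_t dividend, uint32_t divisor, uint32_t *remainder)
{
    uint64_t quotient = 0;
    uint32_t rem = 0;

    for (int i = 63; i >= 0; i--) {
        uint32_t carry = rem >> 31;                       /* bit shifted out of rem */
        rem = (rem << 1) | (uint32_t)((dividend >> i) & 1);
        if (carry || rem >= divisor) {                    /* effectively a 33-bit compare */
            rem -= divisor;                               /* wraps correctly when carry is set */
            quotient |= (uint64_t)1 << i;
        }
    }
    if (remainder)
        *remainder = rem;
    return quotient;
}
For the 18.14 case, the dividend would be the 32-bit numerator shifted left by 14 before the call, so it needs at most 46 bits.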

Resources