PROGMEM array consuming RAM - arduino

I have defined (globally) a large array of strings thus:
const String opCodes[256]PROGMEM = {""...""}; // all 256 defined
However, building this now consumes 20% more RAM than it did before I added the array.
This was unexpected. Why did it happen? Thanks

The Arduino String object is a dynamic string much like std::string. And as such stores its data in dynamically allocated memory in RAM.
If you want to store the actual string data itself in PROGMEM then the Arduino PROGMEM reference will tell you how do do it using actual arrays of characters instead. In short, create arrays of characters stored in PROGMEM, and then make an array of const char * (also in PROGMEM) pointing to the strings.

In the end I decided not to use PROGMEM as its use seems a bit suspect.
A usable workaround was to use the F() function instead. This works.

Related

How should I translate char dbcc_name[1] to Delphi?

I have a problem to translate the following C++ code to Delphi.
This is the code:
char dbcc_name[1]
And this is what I think what it should be:
dbcc_name : array [0..0] of Char;
However, I know this field should return a name, and not just one character.
So, it maybe something like this:
dbcc_name: array of Char;
Now, this looks nice, but there's no way of predicting how long the name will be, and it will probably return something with a load of rubish and somewhere a #0 terminator in it, but that is -I think- not the proper way.
Would it not be wise to use a pointer to this array?
Like:
dbcc_name: PChar;
Thank you in advance.
You were right the first time, only with the wrong data type. Use AnsiChar instead, which is char in C/C++:
dbcc_name: array[0..0] of AnsiChar;
In Delphi 2009+, Char is an alias for WideChar, which in C/C++ is wchar_t on Windows and char16_t on other platforms.
That being said, in C/C++, it makes sense for a 1-element array to exist in a struct when it represents variable-length data, and is the last field in the struct. In this case, the struct usually exists inside of a larger block of allocated memory. There is no array bounds checking in C/C++, the contents of an array can exceed the bounds of the array as long as it doesn't exceed the bounds of the memory that the array is allocated in. Referring to an array by name decays into a pointer to the first element. It is very common in C to exploit this to define a struct that has variable-length data embedded directly inside of it, that can be referred to by name, without having to allocate the data elsewhere in memory. This is especially useful in embedded systems with limited memory.
There are several structs in the Win32 API that use this approach for variable-length data. Raymond Chen discusses this in more detail on his blog:
Why do some structures end with an array of size 1?
You will most likely use either array[0..0] of char or just char, with some caveats. The code below assumes you are using a Windows API and I make assumptions based on one specific Windows Message record that matches your description.
If you are using char dbcc_name[1] as defined in DEV_BROADCAST_DEVICEINTERFACE in C its a char in the DEV_BROADCAST_DEVICEINTERFACE_A structure, but a wchar_t in the DEV_BROADCAST_DEVICEINTERFACE_W structure. NOTE: char in C maps to AnsiChar in Delphi and wchar_t maps to char in Delphi.
With the W strucutre I declare this in Delphi as dbcc_name: char; to read, I simply use PChar(#ARecordPtr^.dbcc_name). Your C++ sample seemingly uses the A struct, a straight translation to Delphi would mean a using the A structure with AnsiChar and using PAnsiChar to read, just replace in the code above.
However, a new Delphi project will by default use the Unicode version (or W imports) of a Windows API so that is why my sample below is written for Unicode.
In my implementation I simply have it defined as char. Some developers like the array[0..0] of char syntax because it leaves a clue of variable length array at that position. It is a more accurate translation, but I find it adds little value.
Example:
PDEV_BROADCAST_DEVICEINTERFACE = ^DEV_BROADCAST_DEVICEINTERFACE;
DEV_BROADCAST_DEVICEINTERFACE = record
dbcc_size: DWORD;
dbcc_devicetype: DWORD; // = DBT_DEVTYP_DEVICEINTERFACE
dbcc_reserved: DWORD;
dbcc_classguid: TGUID;
dbcc_name: Char; // <--- [HERE IT IS]. Use AnsiChar is using the A record instead of the W Record
end;
and to use it
procedure TFoo.WMDeviceChange(var AMessage: TMessage);
var
LUsbDeviceName: string;
LPDeviceBroadcastHeader: PDEV_BROADCAST_HDR;
LPBroadcastDeviceIntf: PDEV_BROADCAST_DEVICEINTERFACE;
begin
if (AMessage.wParam = DBT_DEVICEARRIVAL) then
begin
LPDeviceBroadcastHeader := PDEV_BROADCAST_HDR(AMessage.LParam);
if LPDeviceBroadcastHeader^.dbch_devicetype = DBT_DEVTYP_DEVICEINTERFACE then
begin
LPBroadcastDeviceIntf := PDEV_BROADCAST_DEVICEINTERFACE(LPDeviceBroadcastHeader);
LUsbDeviceName := PChar(#LPBroadcastDeviceIntf^.dbcc_name); // <--- [HERE IT IS USED] Use PAnsiChar if using the A Record instead of the W Record
...
end;
end;
end;
See more in my post on pointers and structures and for more explanation on the odd use of a single character array see the the "Records with Variable Length Arrays" section in my post on arrays and pointer math.

OpenCL - Storing a large array in private memory

I have a large array of float called source_array with the size of around 50.000. I am current trying to implement a collections of modifications on the array and evaluate it. Basically in pseudo code:
__kernel void doSomething (__global float *source_array, __global boolean *res. __global int *mod_value) {
// Modify values of source_array with mod_value;
// Evaluate the modified array.
}
So in the process I would need to have a variable to hold modified array, because source_array should be a constant for all work item, if i modify it directly it might interfere with another work item (not sure if I am right here).
The problem is the array is too big for private memory therefore I can't initialize in kernel code. What should I do in this case ?
I considered putting another parameter into the method, serves as place holder for modified array, but again it would intefere with another work items.
Private "memory" on GPUs literally consists of registers, which generally are in short supply. So the __private address space in OpenCL is not suitable for this as I'm sure you've found.
Victor's answer is correct - if you really need temporary memory for each work item, you will need to create a (global) buffer object. If all work items need to independently mutate it, it will need a size of <WORK-ITEMS> * <BYTES-PER-ITEM> and each work-item will need to use its own slice of the buffer. If it's only temporary, you never need to copy it back to host memory.
However, this sounds like an access pattern that will work very inefficiently on GPUs. You will do much better if you decompose your problem differently. For example, you may be able to make whole work-groups coordinate work on some subrange of the array - copy the subrange into local (group-shared) memory, the work is divided between the work items in the group, and the results are written back to global memory, and the next subrange is read to local, etc. Coordinating between work-items in a group is much more efficient than each work item accessing a huge range of global memory We can only help you with this algorithmic approach if you are more specific about the computation you are trying to perform.
Why not to initialize this array in OpenCL host memory buffer. I.e.
const size_t buffer_size = 50000 * sizeof(float);
/* cl_malloc, malloc or new float [50000] or = {0.1f,0.2f,...} */
float *host_array_ptr = (float*)cl_malloc(buffer_size);
/*
put your data into host_array_ptr hear
*/
cl_int err_code;
cl_mem my_array = clCreateBuffer( my_cl_context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, buffer_size, host_array_ptr, &err_code );
Then you can use this cl_mem my_array in OpenCL kernel
Find out more

Arduino Zero - Region Ram overflowed with stack

I have some code that uses nested Structs to store device parameters see below:
This is using an Ardunio Zero ( Atmel SAMD21)
The declares Storeage with up to 3 networks each network with 64 devices.
I would like to use 5 networks however when I increase the networks to 4 the code will not compile.
I get region RAM overflowed with stack / RAM overflowed by 4432 bytes.
I understand that this is taking more ram then I have? I am looking to see if there is a solution using a different method to achieve the same thing but get it to fit?
struct device {
int stat;
bool changed;
char data[51];
char state[51];
char atime[14];
char btime[14];
};
struct outputs {
device fitting[64];
};
struct storage {
int deviceid =0;
int addstore =0;
bool set;
bool run_events = false;
char authkey[10];
outputs network[3];
} ;
storage data_store;
Well, the usual approches are:
Consider if all or any of the data is actually read-only, and thus can be made const (which should move it to read-only memory, if that fails you can usually force it by adding compiler-specific magic).
Figure out means of representing the data using fewer bits. For instance using 14 bytes for each of three timestamps might seem excessive; switching these to 32-bit timestamps and generating the strings when needed would save around 70%.
If there are duplicates, then perhaps each storage doesn't need three unique outputs, but can instead store pointers into a shared "pool" of unique configurations.
If not all 64 fittings are used, that array could also be refactored into having non-constant length.
It's hard to be more specific since I don't know your data or application well enough.
Your struct is taking too much place. That's all. Assuming chars, ints and bools are internally 1 byte each, your device struct takes 132 bytes. Then, your outputs struct takes 8448 bytes or 8.25Kb. Your unit has 32Kb of RAM...

Specifying multiple address spaces for a type in the list of arguments of function

I wrote a function in OpenCL:
void sort(int* array, int size)
{
}
and I need to call the function once over a __private array and once over a __global array. Apparently, it's not allowed in OpenCL to specify multiple address spaces for a type. Therefore, I should duplicate the declaration of function, while they have exactly the same body:
void sort_g(__global int* array, int size)
{
}
void sort_p(__private int* array, int size)
{
}
This is very inefficient for maintaining the code and I am wondering if there is a better way to manage multiple address spaces in OpenCL or not?
P.S.: I don't see why OpenCL doesn't allow multiple address spaces for a type. Compiler could generate multiple instances of the function (one per address space) and use them appropriately once they're called in the kernel.
For OpenCL < 2.0, this is how the language is designed and there is no getting around it, regrettably.
For OpenCL >= 2.0, with the introduction of the generic address space, your first piece of code works as you would expect.
In short, upgrading to 2.0 would solve your problem (and bring in other niceties), otherwise you're out of luck (you could perhaps wrap your function in a macro, but ew, macros).

Sharing Pointers Between Multiple Forked Processes

If I want to share something like a char **keys array between fork()'d processes using shm_open and mmap can I just stick a pointer to keys into a shared memory segment or do I have to copy all the data in keys into the shared memory segment?
All data you want to share has to be in the shared segment. This means that both the pointers and the strings have to be in the shared memory.
Sharing something which includes pointers can be cumbersome. This is because mmap doesn't guarantee that a given mapping will end up in the required address.
You can still do this, by two methods. First, you can try your luck with mmap and hope that the dynamic linker doesn't load something at your preferred address.
Second method is to use relative pointers. Inside a pointer, instead of storing a pointer to a string, you store the difference between the address of the pointer and the address of the string. Like so:
char **keys= mmap(NULL, ...);
char *keydata= (char*) keys + npointers * sizeof(char*);
strcpy(keydata, firstring);
keys[0]= (char*) (keydata - (char*) &keys[0]);
keydata+= strlen(firststring)+1;
When you want to access the string from the other process, you do the reverse:
char **keys= mmap(NULL, ...);
char *str= (char*) (&keys[0]) + (ptrdiff_t) keys[0];
It's a little cumbersome but it works regardless of what mmap returns.

Resources