Processing large numbers of arrays and memory exhaustion - Arduino

First: I've inherited this project from someone who couldn't complete it due to time constraints.
The code contains a little over 100 declared arrays, each one containing a set of INTs. The arrays are all unique.
byte arr_foo[] = {2, 5, 6, 8, 3};
byte arr_bar[] = {1, 7};
byte arr_baz[] = {6, 10, 9, 11, 7, 8, 3};
Those INTs each refer to a specific LED on the board - there are 11 in total. The arrays represent specific sequences in which these LEDs should light up.
What they were attempting to do was to write a routine that, when given an array name, would then go fetch the contents of that array and process the INTs. Well, passing an array name as a string to then be matched with a variable doesn't work. And this is where they passed it on, saying they don't have time to figure it out.
So I'm looking at this and thought, why not a 2-dimensional array? I quickly ran into trouble there.
byte seqs[][7] = {
    {2, 5, 6, 8, 3},
    {1, 7},
    {6, 10, 9, 11, 7, 8, 3}
};
While in principle this works, the issue here is that it pads each array with trailing zeros because I told it each one has [7] elements. This results in a lot of memory being wasted and the thing running out of memory.
So I am stuck. I'm not sure how to deal with 100+ separate arrays, other than to write 100+ separate routines to be called later. Nor can I figure out how to make it more efficient.
Then there's the issue of, I may still run out of memory at a later time as more sequences are added. So then what? Add an external i2c flash memory, and shove things in there? Having never dealt with that, I'm not sure how to implement that, in what format to store the values, and how to do it. Am I correct that one has to first write a program that loads all the data in memory, upload that and run it, then put the actual program that's going to process that data on the micro controller?
So I guess I'm asking for two things: What's a better way of handling lots and lots of (small) arrays and be able to use them within a routine that calls them, and if I'm better off shoving this data into an external flash, what format should they be stored in?

Putting the data into 2D arrays won't save any space at all.
Right now you're storing these values in your 2 KB of SRAM. Change the declarations to use the PROGMEM keyword, so the data is stored where there's much more space.
PROGMEM instructs the compiler to place the data in the flash portion of memory:
const PROGMEM uint8_t arr_foo[] = { 2, 5, 6, 8, 3 };
However, the data then has to be read back through helper functions; you can't just access it directly:
for (byte k = 0; k < 5; k++)
{
    uint8_t next_led = pgm_read_byte_near(arr_foo + k);
    // Do something with next_led
}
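To handle 100+ such arrays without writing 100+ routines, the PROGMEM arrays can be gathered into a table of pointers plus lengths and selected by index. Here is a sketch of that idea in plain C, with the PROGMEM qualifiers and pgm_read_byte/pgm_read_word reads left out so the structure is visible; on the AVR, each array and the table itself would carry PROGMEM, and every read would go through the pgm_read_...() helpers:

```c
#include <stdint.h>

/* The original ragged arrays, unchanged and unpadded. */
static const uint8_t arr_foo[] = {2, 5, 6, 8, 3};
static const uint8_t arr_bar[] = {1, 7};
static const uint8_t arr_baz[] = {6, 10, 9, 11, 7, 8, 3};

/* One table entry per sequence: a pointer plus its real length. */
struct sequence {
    const uint8_t *data;
    uint8_t len;
};

static const struct sequence seqs[] = {
    { arr_foo, sizeof arr_foo },
    { arr_bar, sizeof arr_bar },
    { arr_baz, sizeof arr_baz },
    /* ...and so on for the remaining sequences... */
};

/* One routine handles every sequence, selected by index,
   calling `light` (your LED routine) for each entry. */
static void play_sequence(uint8_t which, void (*light)(uint8_t led)) {
    const struct sequence *s = &seqs[which];
    for (uint8_t k = 0; k < s->len; k++)
        light(s->data[k]);
}
```

Because each row keeps its own length, there is no padding at all, and adding a new sequence is just one array plus one table entry.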

If these arrays form a pattern of LEDs that should be lit while the others are switched off, you could store the state of all LEDs in a uint16_t and have an array of those in PROGMEM (as in Kingsley's answer).
If you're not familiar with HEX notation you could use the binary format.
const PROGMEM uint16_t patterns[] = {
// BA9876543210 Led Pins
0b000101101100, //foo: 2, 5, 6, 8, 3
0b000010000010, //bar: 1, 7
0b111111001000, //baz: 6, 10, 9, 11, 7, 8, 3
// ...
};
I'm not sure about the ordering of your numbers, though, so this mapping is only a guess.
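For what it's worth, testing a single LED's bit in such a pattern word could look like this (assuming, as in the comment column above, that bit n of the word drives LED pin n):

```c
#include <stdint.h>

/* Returns 1 if LED `led` (0..11) should be on for this pattern word.
   Assumption: bit n of the word corresponds to LED pin n. */
static int led_is_on(uint16_t pattern, uint8_t led) {
    return (pattern >> led) & 1u;
}
```

On the AVR, the pattern word itself would first be fetched from PROGMEM with pgm_read_word_near() before being handed to a helper like this.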

Update
To me, your comments changed the intention of your question completely.
As I read it now, there's no need for a special kind of "name" data to identify an array. What you want seems to be just to pass different arrays around as function arguments.
This is usually done via pointers, and there are two things to note:
Most of the time, arrays "decay" to pointers automatically. That means that in most places an array variable can be used in place of a pointer. And a pointer can be used like an array.
An array in C does not carry any length information at runtime. The length of an array needs to be held/passed separately. Alternatively, you can define a struct (or class in C++) which contains both the array and its length as members.
Example:
If you want to pass an array of elements of type T to a function, you can declare the function to accept a pointer to T:
void somefunc(uint8_t* arr, uint8_t arrLength) {
    for ( uint8_t i = 0; i < arrLength; i++ ) {
        uint8_t value = arr[i];
        value = *(arr + i); // equivalent to arr[i]
    }
}
or equivalently
void somefunc(uint8_t arr[], uint8_t arrLength) {
...
}
then call that function by simply passing the array variable and the corresponding array's length, like
uint8_t arr_foo[] = { 1,2,3,4,5 };
uint8_t arr_bar[] = { 1,2 };
somefunc(arr_foo,5);
somefunc(arr_bar,2);
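Rather than hard-coding the 5 and the 2, the length can be computed at the call site with a common idiom (this macro is not in the original code, just a convenience):

```c
#include <stdint.h>

/* Element count of an array; only valid on the array variable itself,
   not on a pointer it has decayed to. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static const uint8_t arr_demo[] = {1, 2, 3, 4, 5};
```

The calls then become `somefunc(arr_foo, ARRAY_LEN(arr_foo));`, so a length can never silently drift out of sync with its array.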
The arrays' constant data can be put into PROGMEM to save RAM, but, as others have noted, read accesses become a little more complex, requiring pgm_read_...() calls in C++. (AVR GCC supports __flash-qualified data only in C, not in C++.)
Then there's the issue of, I may still run out of memory at a later
time as more sequences are added.
Notice that a typical "Arduino" AVR has 32 KB of flash memory. If each sequence consumes about 15 bytes, the flash could still hold on the order of a thousand of these sequences along with your program.
then what? Add an external i2c flash memory, and shove things in
there? Having never dealt with that, I'm not sure how to implement
that, in what format to store the values, and how to do it.
If you actually run out of flash at some point you can still resort to any form of external storage.
A common solution is SPI flash memory, which is readily available in the mega-bit range. Winbond is a well-known supplier. Just search for "Arduino SPI flash" modules and libraries.
A more complex approach would be to support SD cards as external memory. But probably not worth it unless you actually want to store gigabytes of data.
Am I correct that one has to first write a program that loads all the
data in memory, upload that and run it, then put the actual program
that's going to process that data on the micro controller?
That's definitely one way to do it. If your code space permits, you can alternatively include the routines for writing to the external flash in your application itself, like a kind of bootloader, so you can switch into an "upload external flash data" mode without re-flashing the microcontroller.

Related

Why use pointers and reference in codesys V3?

My question is: what are the benefits of using pointers and references?
I am new to CODESYS, and in my previous job I programmed in TIA Portal (Siemens) and Sysmac Studio (Omron) and never came across pointers or anything similar. I think I understand how they work, but I'm not sure when I should use them myself.
For example, I just received a function block from a supplier:
Why don't they just have an array for input and output?
First of all, if you have ever used the VAR_IN_OUT declaration, then you have already used references, since that is equivalent to a VAR with REFERENCE TO.
As for the uses, there are mainly 4 that I can think of right now:
Type Punning, which you can also achieve using a UNION, but you may not want to have to create a union for every single reinterpretation cast in your code.
TL;DR: to save memory and copy execution time. Whenever you pass some data to a function/function block, it gets copied. This is not a big problem if your PLC has enough CPU power and memory; however, if you are dealing with especially large data on a low-end PLC, you may either exceed real-time execution constraints or run out of memory. When you pass a pointer/reference, however, only the pointer/reference gets copied, no matter how big the data is: 4 bytes on a 32-bit system and 8 bytes on a 64-bit one.
In C-style languages you'd use pointers/references when you want a function to return multiple values without the hassle of creating a custom structure every time. You can do the same here too; however, in CODESYS a function can have multiple outputs, for example:
VAR_OUTPUT
out1 : INT; (*1st output variable *)
out2 : INT; (*2nd output variable *)
//...
END_VAR
And finally, as I mentioned at the very beginning: when you want to pass data that needs to be modified in the function itself, in other words wherever you could use VAR_IN_OUT, you can also use pointers/references. One special case where you will have to use a pointer is a Function Block that receives some data in the FB_Init (initialization/constructor) method and stores it locally. In that case you would have a pointer as a local variable in the function block and take the address of the variable in FB_Init. The same applies if you have a structure that needs to reference another structure or some data.
PS. There are probably some other uses I missed. One of the main uses in other languages is dynamic memory allocation, but in CODESYS this is disabled by default, not all PLCs support it, and hardly anyone (that I know of) uses it.
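The copy-avoidance argument above can be illustrated in plain C terms (a hypothetical struct; CODESYS pointers and VAR_IN_OUT behave analogously):

```c
#include <stdint.h>

/* A deliberately large piece of data. */
struct BigData {
    uint8_t payload[4096];
};

/* By value: all 4096 bytes are copied into the callee on every call. */
static void by_value(struct BigData d) { (void)d; }

/* By pointer: only the address is copied, regardless of payload size. */
static void by_pointer(const struct BigData *d) { (void)d; }
```

The same call made thousands of times per PLC cycle is where that per-call copy cost starts to matter.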
EDIT: Though this has been accepted, I want to bring a real life example of us using pointers:
Suppose we want to have a Function Block that calculates the moving average on a given number series. A simple approach would be something like this:
FUNCTION_BLOCK MyMovingAvg
VAR_INPUT
    nextNum: INT;
END_VAR
VAR_OUTPUT
    avg: REAL;
END_VAR
VAR
    window: ARRAY [0..50] OF INT;
    currentIndex: UINT;
END_VAR
However, this has the problem that the moving window size is static and predefined. If we wanted to have averages for different window sizes we would either have to create several function blocks for different window sizes, or do something like this:
FUNCTION_BLOCK MyMovingAvg
VAR CONSTANT
    maxWindowSize: UINT := 100;
END_VAR
VAR_INPUT
    nextNum: INT;
    windowSize: UINT (0..maxWindowSize);
END_VAR
VAR_OUTPUT
    avg: REAL;
END_VAR
VAR
    window: ARRAY [0..maxWindowSize] OF INT;
    currentIndex: UINT;
END_VAR
where we would only use the elements of the array from 0 to windowSize and the rest would be ignored. This however also has the problems that we can't use window sizes more than maxWindowSize and there's potentially a lot of wasted memory if maxWindowSize is set high.
There are 2 ways to get a truly general solution:
Use dynamic allocation. However, as I mentioned previously, this isn't supported by all PLCs, is disabled by default, has drawbacks (you'll have to split your memory into two chunks), is hardly used, and is not very CODESYS-like.
Let the user define the array of whatever size they want and pass the array to our function block:
FUNCTION_BLOCK MyMovingAvg
VAR_INPUT
    nextNum: INT;
END_VAR
VAR_OUTPUT
    avg: REAL;
END_VAR
VAR
    windowPtr: POINTER TO INT;
    windowSize: DINT;
    currentIndex: UINT;
END_VAR

METHOD FB_Init: BOOL
VAR_INPUT
    bInitRetains: BOOL;
    bInCopyCode: BOOL;
END_VAR
VAR_IN_OUT // basically REFERENCE TO
    window_buffer: ARRAY [*] OF INT; // array of any size
END_VAR

THIS^.windowPtr := ADR(window_buffer);
THIS^.windowSize := UPPER_BOUND(window_buffer, 1) - LOWER_BOUND(window_buffer, 1) + 1;
// usage:
PROGRAM Main
VAR
    avgWindow: ARRAY [0..123] OF INT; // whatever size you want!
    movAvg: MyMovingAvg(window_buffer := avgWindow);
END_VAR

movAvg(nextNum := 5);
movAvg.avg; // read the computed average
The same principle can be applied to any function block that operates on arrays (for example, we also use it for sorting). Moreover, similarly you may want to have a function that works on any integer/floating number. For that you may use one of the ANY types which is basically a structure that holds a pointer to the first byte of the data, the size of the data (in bytes) and a type enum.

OpenCL - Storing a large array in private memory

I have a large array of floats called source_array, with a size of around 50,000. I am currently trying to implement a collection of modifications on the array and evaluate it. Basically, in pseudocode:
__kernel void doSomething (__global float *source_array, __global bool *res, __global int *mod_value) {
// Modify values of source_array with mod_value;
// Evaluate the modified array.
}
So in the process I would need a variable to hold the modified array, because source_array should stay constant for all work items; if I modify it directly it might interfere with another work item (not sure if I am right here).
The problem is the array is too big for private memory therefore I can't initialize in kernel code. What should I do in this case ?
I considered adding another parameter to the kernel to serve as a placeholder for the modified array, but again it would interfere with other work items.
Private "memory" on GPUs literally consists of registers, which generally are in short supply. So the __private address space in OpenCL is not suitable for this as I'm sure you've found.
Victor's answer is correct - if you really need temporary memory for each work item, you will need to create a (global) buffer object. If all work items need to independently mutate it, it will need a size of <WORK-ITEMS> * <BYTES-PER-ITEM> and each work-item will need to use its own slice of the buffer. If it's only temporary, you never need to copy it back to host memory.
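The sizing and slicing arithmetic for such a per-work-item scratch buffer can be sketched in plain C (inside the kernel, `gid` would come from `get_global_id(0)`; the names here are illustrative):

```c
#include <stddef.h>

/* Host-side size of one big scratch buffer that holds a private slice
   of `floats_per_item` floats for each of `n_items` work items. */
static size_t scratch_bytes(size_t n_items, size_t floats_per_item) {
    return n_items * floats_per_item * sizeof(float);
}

/* First float index of work-item `gid`'s slice; in the kernel,
   gid = get_global_id(0). */
static size_t slice_begin(size_t gid, size_t floats_per_item) {
    return gid * floats_per_item;
}
```

With 50,000 floats per work item this buffer grows very quickly, which is one more reason to prefer the work-group decomposition described below.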
However, this sounds like an access pattern that will work very inefficiently on GPUs. You will do much better if you decompose your problem differently. For example, you may be able to make whole work-groups coordinate work on some subrange of the array: copy the subrange into local (group-shared) memory, divide the work between the work items in the group, write the results back to global memory, then read the next subrange into local memory, and so on. Coordinating between work items in a group is much more efficient than each work item accessing a huge range of global memory. We can only help you with this algorithmic approach if you are more specific about the computation you are trying to perform.
Why not initialize this array in an OpenCL host memory buffer? I.e.:
const size_t buffer_size = 50000 * sizeof(float);
/* allocate with malloc, new float[50000], or a static initializer */
float *host_array_ptr = (float *)malloc(buffer_size);
/*
put your data into host_array_ptr here
*/
cl_int err_code;
cl_mem my_array = clCreateBuffer( my_cl_context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, buffer_size, host_array_ptr, &err_code );
Then you can use this cl_mem my_array in your OpenCL kernel.

Arduino Zero - Region Ram overflowed with stack

I have some code that uses nested structs to store device parameters, see below.
This is on an Arduino Zero (Atmel SAMD21).
The code declares storage for up to 3 networks, each network with 64 devices.
I would like to use 5 networks; however, when I increase the networks to 4, the code will not compile.
I get: region RAM overflowed with stack / RAM overflowed by 4432 bytes.
I understand this means it is taking more RAM than I have. Is there a different way to structure this that achieves the same thing but fits?
struct device {
    int stat;
    bool changed;
    char data[51];
    char state[51];
    char atime[14];
    char btime[14];
};

struct outputs {
    device fitting[64];
};

struct storage {
    int deviceid = 0;
    int addstore = 0;
    bool set;
    bool run_events = false;
    char authkey[10];
    outputs network[3];
};
storage data_store;
Well, the usual approaches are:
Consider whether any of the data is actually read-only and can thus be made const (which should move it to read-only memory; if that fails, you can usually force it with compiler-specific magic).
Figure out a way to represent the data using fewer bits. For instance, using 14 bytes for each of the two timestamps seems excessive; switching to 32-bit timestamps and generating the strings only when needed would save around 70% of that space.
If there are duplicates, then perhaps each storage doesn't need unique outputs for every network, but can instead store pointers into a shared "pool" of unique configurations.
If not all 64 fittings are used, that array could also be refactored to a non-constant length.
It's hard to be more specific since I don't know your data or application well enough.
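As a sketch of the second suggestion, here is a hypothetical slimmed-down device layout, with stat shrunk to 16 bits and the two 14-character time strings replaced by 32-bit epoch timestamps that get formatted only for display (field names kept from the original; whether stat fits in 16 bits is an assumption):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slimmed-down replacement for the original `device`. */
struct device_slim {
    int16_t  stat;      /* was a 4-byte int */
    bool     changed;
    char     data[51];
    char     state[51];
    uint32_t atime;     /* epoch seconds instead of char[14] */
    uint32_t btime;
};
```

That alone trims each device from 136 bytes (4-byte int plus alignment padding on the SAMD21) to 116; the bigger win, if the data/state strings can be shortened or pooled, would come from those two 51-byte fields.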
Your struct is simply taking too much space. On the SAMD21, an int is 4 bytes and a bool is 1, so each device struct takes 4 + 1 + 51 + 51 + 14 + 14 = 135 bytes, padded to 136 for alignment. Each outputs struct is then 64 * 136 = 8704 bytes, so three networks already consume about 26 KB of your 32 KB of RAM, and five would need around 43 KB.

Write to the same file from different MPI processes

I have some MPI processes which should write to the same file after they finish their task. The problem is that the length of the results is variable and I cannot assume that each process will write at a certain offset.
A possible approach would be to open the file in every process, to write the output at the end and then to close the file. But this way a race condition could occur.
How can I open and write to that file so that the result would be the expected one?
You might think you want the shared file or ordered mode routines. But these routines get little use and so are not well optimized (so they get little use... quite the cycle...)
I hope you intend to do this collectively. Then you can use MPI_Scan to compute the offsets and call MPI_File_write_at_all to have the MPI library optimize the I/O for you.
(If you are doing this independently, then you will have to do something like... master/worker? passing a token? falling back to the shared file pointer routines, even though I hate them?)
Here's an approach for a good collective method:
MPI_Offset offset = 0;
long long incr = count * datatype_size;   /* bytes this rank will write */
long long new_offset = 0;
MPI_Status status;
int ret;

/* you can skip this call and assume 'offset' is zero if you don't care
   about the existing contents of the file */
MPI_File_get_position(mpi_fh, &offset);

/* inclusive prefix sum of every rank's write size... */
MPI_Scan(&incr, &new_offset, 1, MPI_LONG_LONG_INT, MPI_SUM, MPI_COMM_WORLD);
new_offset -= incr;    /* ...made exclusive: where this rank's data starts */
new_offset += offset;  /* shifted past whatever is already in the file */

ret = MPI_File_write_at_all(mpi_fh, (MPI_Offset)new_offset, buf, count,
                            datatype, &status);

When should I use MPI_Datatype instead of serializing manually?

It all began when I needed to MPI_Bcast a 64-bit integer. Since MPI does not know how to handle it, I did:
template<typename T>
inline int BcastObjects(T* pointer,
                        int count,
                        int root,
                        MPI_Comm comm)
{
    return MPI_Bcast(pointer,
                     count * sizeof(*pointer),
                     MPI_BYTE,
                     root,
                     comm);
}
Now I can do:
int64_t i = 0;
BcastObjects(&i, 1, root_rank, some_communicator);
Then I started to use BcastObjects to send over an array of structures. I wonder if it's OK to do that?
The manuals about MPI_Datatype focus on how to do it, but not on why would I want to do it.
Why not just use MPI_INT64_T?
You can always mock up your own datatypes with MPI_BYTE or what have you; the datatype machinery is there so that you don't have to. In many cases it's much easier: if you want to send data that has "holes" in it (e.g. a slice of a multidimensional array, or a structure with padding gaps), you can map that out fairly straightforwardly with a datatype, whereas otherwise you'd have to manually count out byte strings and use something like MPI_Pack. And of course, describing the data at a higher level is less brittle if something in your data structure changes.
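The "holes" point is easy to see without any MPI at all. In this illustrative struct (a made-up example, not from the question), the compiler inserts padding, so shipping `count * sizeof(...)` raw bytes with MPI_BYTE also ships the padding, and silently breaks if a member is added or reordered; a datatype built with MPI_Type_create_struct and offsetof() describes the real layout instead:

```c
#include <stddef.h>
#include <stdint.h>

/* A struct with a hole: pos needs 8-byte alignment, so the compiler
   inserts 7 bytes of padding after tag on typical ABIs. offsetof()
   reports where each member really lives, which is exactly what you
   feed to MPI_Type_create_struct. */
struct particle {
    int8_t tag;      /* offset 0 */
    double pos[3];   /* offset 8, not 1 */
};
```

On a typical 64-bit ABI, sizeof(struct particle) is 32 even though the members only occupy 25 bytes; those 7 hidden bytes are what the byte-copy approach transmits blindly.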
