Memory limits on Arduino

I recently bought an Arduino Uno, and now I am experimenting a bit with it. I have a couple of 18B20 sensors and an ENC28J60 network module connected to it, and I am writing a sketch that lets me connect to the board from a browser and read out the temperatures, either as a simple web page or as JSON. The code that builds the web page looks like this:
client.print("Inne: ");
client.print(tempin);
client.println("<br />");
client.print("Ute: ");
client.print(tempout);
client.print("<br /><br />");
client.println(millis()/1000);
// client.print("j");
The strange thing is: if I uncomment the last line, the sketch compiles fine and uploads fine, but I can no longer connect to the board. The same thing happens if I add a few more characters to some of the other printouts. So it looks to me as if I'm running into some kind of memory limit (the total size of the sketch is about 15 KB, and there are some other strings used elsewhere in the code - and yes, I know, I will rewrite it to use an array to store the temperatures, I've just borrowed some code from an example).
Is there any limit on how much memory I can use to store strings on an Arduino, and is there any way to get around it? (Using IDE v1.0.1 on a Debian PC with GCC-AVR 4.3.5 and AVR Libc 1.6.8.)

RAM is rather scarce: the Uno's ATmega328 has only 2 KB of SRAM. You may simply be running out of RAM. I learned that when it runs out, the board just kind of sits there.
I suggest reading the readme of the MemoryFree library to get the free RAM. It mentions how .print() can consume both RAM and ROM.
With Arduino IDE 1.0+, I now always use
Serial.print(F("HELLO"));
rather than
Serial.print("HELLO");
as it saves RAM, and the same should be true for lcd.print(). I also always put a
Serial.println(freeMemory(), DEC); // print how much RAM is available
at the beginning of the code and pay attention to it, noting that there needs to be enough free RAM left for the code to run and recurse into its subroutines.
For IDEs prior to 1.0.0, the library provides getPSTR().
IDE 1.0.3 now displays the expected RAM usage at the end of the compile. However, I find it often falls short, as it is only an estimate.
I also recommend that you look at Webduino, as it is a library with JSON support. Its examples are very quick to get going. However, it does not directly support the ENC28J60.

I use the following code to get the free available RAM:
int getFreeRam()
{
  extern int __heap_start, *__brkval;
  int v;
  // Free RAM is the gap between the top of the heap and the current stack pointer
  v = (int) &v - (__brkval == 0 ? (int) &__heap_start : (int) __brkval);
  Serial.print("Free RAM = ");
  Serial.println(v, DEC);
  return v;
}

You can check the memory usage with a small library called memoryFree.
If there is RAM left over, you might be hitting the serial buffer limit rather than the RAM limit. If so, you can increase SERIAL_BUFFER_SIZE in HardwareSerial.cpp (C:\Program Files (x86)\Arduino\hardware\arduino\cores\arduino on a Windows machine).
Be careful though: the serial buffer lives in the same SRAM as your variables, so increasing it leaves less memory available for them.
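In the 1.0-era core the buffer size is a plain #define in HardwareSerial.cpp, so raising it is a one-line change (the stock values below are from memory, so treat them as approximate):
#define SERIAL_BUFFER_SIZE 128 // stock value is 64 (16 on parts with very little RAM); every extra byte comes straight out of SRAM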
For playing with JSON on the Arduino there is a really nice library called aJson.
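As a rough sketch of what the question's JSON output could look like with aJson (based on its documented object API; tempin and tempout are the question's variables, and error handling is omitted):
#include <aJSON.h>

aJsonObject* root = aJson.createObject();
aJson.addNumberToObject(root, "inne", tempin);  // indoor temperature
aJson.addNumberToObject(root, "ute", tempout);  // outdoor temperature
char* json = aJson.print(root);                 // heap-allocated string
client.println(json);
free(json);                                     // aJson.print() uses malloc
aJson.deleteItem(root);                         // release the object tree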

Add this function and call it in setup() and every now and then in your loop to make sure RAM is not being used up.
// Private function: from http://arduino.cc/playground/Code/AvailableMemory
int freeRam () {
  extern int __heap_start, *__brkval;
  int v;
  return (int) &v - (__brkval == 0 ? (int) &__heap_start : (int) __brkval);
}
You call it inside a print, for example:
Serial.println(freeRam());

Related

ESP32 - Best Way to Store Data Offline and Push when Network Connected

I am coding an offline, battery-powered ESP32 to take periodic sensor readings and store them until a hotspot is found, at which point it connects and pushes the data elsewhere. I am relatively new to the ESP32 and am asking for suggestions on the best way to do this.
I was thinking of storing the reading and DateTime in SPIFFS and running a web server that starts when a network is found, checking every minute or so. Since it is battery-powered, I would also like to deep-sleep the board to save power. Does the setup() function run again when the board comes out of deep sleep, or would I need to have my connectToWiFi function inside the loop?
Is this viable? And are there any better routes to take? I've seen things on asynchronous servers and using the ESP32 as an access point that could maybe work. Is it best to download the file through a web server or send the file line by line to a free online database?
Deep sleep on the ESP32 is almost the equivalent of being power cycled - the CPU restarts, and any dynamic memory will have lost its contents. An Arduino program will enter setup() after deep sleep and will have to completely reinitialize everything the program needs to run.
There is a very small area (8 KB) of static memory associated with the real-time clock (RTC) which is retained during deep sleep. You can directly reference variables stored there by using a special attribute (RTC_DATA_ATTR) when you declare the variable.
For instance, you could use a variable stored in this area to count the number of times the CPU has slept and woken up.
RTC_DATA_ATTR uint64_t sleep_counter = 0;

void setup() {
  sleep_counter++;
  Serial.begin(115200);
  Serial.print("ESP32 has woken up ");
  Serial.print(sleep_counter);
  Serial.println(" times");
}
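To answer the deep-sleep part of the question directly: setup() does run again after deep sleep. You arm a wakeup source and call esp_deep_sleep_start(), and execution restarts at the top of the sketch. A minimal sketch (the 60-second interval is an arbitrary choice of mine):
void goToSleep() {
  esp_sleep_enable_timer_wakeup(60ULL * 1000000ULL); // wake after 60 s (argument is in microseconds)
  esp_deep_sleep_start();                            // never returns; the chip reboots into setup()
}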
Beware that it's generally not safe to store objects in this area - you don't necessarily know whether they've allocated memory that won't persist during deep sleep. So storing a String in this memory won't work. Also storing a struct with pointers generally won't work as the pointers won't point to storage in this area.
Also beware that if the ESP32 loses power, RTC_DATA_ATTR will be wiped out.
The RTC static RAM also has the advantage of not costing as much power to write to as SPIFFS.
If you need more storage than this, SPIFFS is certainly an option. Beware that ESP32 modules generally use cheap flash memory which is rated for a maximum of maybe 100,000 write/erase cycles.
SPIFFS performs wear-leveling, which will help avoid writing to the same location in flash over and over again, but eventually it will still wear out. This isn't a problem for most projects but suppose you're writing to SPIFFS once a minute for two years - that's over a million writes. So if you're looking for persistent storage that's frequently written to over a very long time you might want to use a better quality of flash storage like an external SD card.
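If you do go the SPIFFS route, appending a record is straightforward. A minimal sketch, assuming the ESP32 Arduino core (the file name and CSV format are my own choices, and SPIFFS.begin(true) must have been called once in setup()):
#include "SPIFFS.h"

void logReading(float reading) {
  File f = SPIFFS.open("/log.csv", FILE_APPEND); // append one line per reading
  if (f) {
    f.printf("%lu,%.2f\n", millis(), reading);   // timestamp, value
    f.close();
  }
}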
If an SD card is not an option (beware not to pull out the SD card while writing!), I would write to SPIFFS, or do a direct write with esp_partition_write().
For the latter: if you use fixed-size structs for your sensor data (plus time etc.) and start the partition with a small table (a mini FAT) holding the first entry still to be processed and the amount, updating it each time, it's easy to retrieve the data (no fuss with reading lines). Keep in mind that every time you erase the flash, the wear counts for the whole block! So if you can accept that old data stays present but is ignored, this can dramatically reduce wear.
For example, say you write this struct:
struct SensorRecord {
  uint8_t day;
  uint8_t month;
  uint8_t year;      // year minus 2000, max 255
  uint8_t hour;
  uint8_t minutes;
  uint8_t seconds;
  uint8_t sensorMSB;
  uint8_t sensorLSB;
};
That's 8 bytes.
The first struct (call it the mini FAT):
struct MiniFat {
  uint8_t firstToProcessMSB;
  uint8_t firstToProcessLSB;
  uint8_t amountToProcessMSB;
  uint8_t amountToProcessLSB;
  uint8_t ID0;
  uint8_t ID1;
  uint8_t ID2;
  uint8_t ID3;
};
Also eight bytes. You can use the ID bytes as a magic value to recognize a properly initialized partition, but that's up to you. It's only a suggestion!
In a 65,536-byte partition you can append 8192 records (minus 1 for the mini FAT) before you have to erase, a maximum of perhaps 100,000 times.
When your device makes contact, you read out the first bytes. Check that the ID is OK and read the start position. With fseek you then step 8 bytes per hop up to the end position and read all pending values in one go. If successful, you set the start position to end + 1, and only erase when things get tight (not enough space).
It's wise to erase before the longest expected offline period would run you out of space; otherwise you will lose data. Or you just make the partition bigger. In this example you could write one record every minute for about five days before having to erase.
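A rough sketch of that retrieval loop using the ESP-IDF partition API (the helper name, the byte layout, and the assumption that the mini FAT occupies record slot 0 are mine, not the answer's):
#include "esp_partition.h"

void readPending(const esp_partition_t* part) {
  uint8_t fat[8];
  esp_partition_read(part, 0, fat, sizeof(fat));       // mini FAT lives in record slot 0
  uint16_t first  = ((uint16_t)fat[0] << 8) | fat[1];  // first record still to process
  uint16_t amount = ((uint16_t)fat[2] << 8) | fat[3];  // number of records pending

  uint8_t rec[8];
  for (uint16_t i = 0; i < amount; i++) {
    esp_partition_read(part, (size_t)(first + i) * 8, rec, sizeof(rec));
    // ... decode rec and push it to the server ...
  }
}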

How to make .print() macro use F() by default (Arduino)

I usually use the hardware UART for serial debugging during development.
This means that most of the text sent to print() will not be sent in the final product (which is limited to repair-level messages only).
With my debugging messages being quite verbose, with lots of tabs and variable values (and their descriptions), I find that my 1,500-line project spends several times more RAM on debugging messages than on the program itself.
At one byte per character, 2000 characters is nothing, and that is the entire SRAM.
Most of my non-debugging serial communication (which during development runs over a software-serial TX) uses write(), works with raw bytes, and does not send actual text (currently my serial communication routine uses 6-byte blocks, including addressing).
To the point: I use Streaming.h to make appending text to serial output quicker to write.
It is annoying to keep wrapping text strings in F() every single time.
The F() macro slows the device down, because rather than wasting RAM globally it reads the string from flash every time it is used; but without it my debugging messages use too much SRAM (Arduino loads string literals into RAM as globals).
Is there a way of making print() use the F() macro by default, without editing the Wire.h library? (That would prevent me from automatically updating the header files.)
You should use F() to store your text in flash memory; you can't avoid this.
You can define the macro:
#define FPRINT(x) print(F(x))
Serial.FPRINT("text");
Or even like this:
#define SFPRINT(x) Serial.print(F(x))
SFPRINT("test");
Of course, you can replace FPRINT with any name you want that is not already defined (otherwise you will get a compiler warning).
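Building on the same idea, here is a sketch of my own (not part of the answer above): the macro can also compile away entirely in the final product, which matches the question's goal of shipping only repair-level messages:
#define DEBUG 1 // set to 0 for the final product

#if DEBUG
  #define DBG_PRINT(x)   Serial.print(F(x))
  #define DBG_PRINTLN(x) Serial.println(F(x))
#else
  #define DBG_PRINT(x)   // expands to nothing: no flash, no RAM, no runtime cost
  #define DBG_PRINTLN(x)
#endif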
You can also use the printf_P function from <stdio.h> and the PSTR macro from <avr/pgmspace.h> (they should be included by default in an Arduino IDE program).
Typical use with text stored in RAM:
int a = 5;
printf("This is my variable: %d", a);
Result:
Sketch uses 1946 bytes (6%) of program storage space.
Global variables use 39 bytes (1%) of dynamic memory
Use with text stored in FLASH:
int a = 5;
printf_P(PSTR ("This is my variable: %d"), a);
Result:
Sketch uses 1958 bytes (6%) of program storage space.
Global variables use 15 bytes (0%) of dynamic memory

Strange characters coming from Arduino when writing to SD card

I am connecting an SD card to an Arduino which is then communicating over serial with Visual Studio. Everything works fine independently, and 99% collectively. If I put the code below in setup(), it works fine. If I move it into a function which is called when a specific character is sent from Visual Studio, I get the strange characters shown at the bottom.
I have debugged each step of the code and nothing seems abnormal. Unfortunately I cannot post the code, as 1) it's far too long and 2) it's confidential. :(
I understand that without code I cannot get a complete solution, but what are those characters? Why does it work perfectly in setup() but produce all kinds of randomness in a function?
myFile = SD.open("test.txt");
if (myFile) {
  Serial.println("test.txt:");
  // read from the file until there's nothing else in it:
  while (myFile.available()) {
    Serial.write(myFile.read());
  }
  // close the file:
  myFile.close();
} else {
  // if the file didn't open, print an error:
  Serial.println("error opening test.txt");
}
整瑳湩⁧ⰱ㈠‬⸳ࠀ -- Copied straight from the text file
整瑳湩%E2%81%A7ⰱ㈠%E2%80%AC⸳ࠀ -- Output when pasted into google
It's the Arduino Due, and yes, lots of String, including 4 x 2D string arrays we are using to upload to a TFT screen. I ran into memory issues with the Uno but thought we would be OK with the Due as it has considerably more RAM?
Well, there's your problem. I like how you ended that with a question mark. :) Having extra RAM is no guarantee that the naughty String will behave, NOT EVEN ON SERVERS WITH GIGABYTES OF RAM. To summarize my Arduino forum post:
Don't use String™
The solutions using C strings (i.e., char arrays) are very efficient. If you're not sure how to convert a String usage to a character array, please post the snippet. You will get lots of help!
String will add at least 1600 bytes of FLASH to your program size and 10 bytes per String variable. Although String is fairly easy to use and understand, it will also cause random hangs and crashes as your program runs longer and/or grows in size and complexity. This is due to dynamic memory fragmentation caused by the heap management routines malloc and free.
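As a taste of the conversion, here is a minimal sketch of my own (the buffer size and variable names are arbitrary): a fixed char array plus snprintf replaces String concatenation with zero heap use:
float tempC = 23.4;              // example reading (assumed non-negative here)
char line[32];                   // fixed buffer: no heap, no fragmentation
int t10 = (int)(tempC * 10);     // one decimal place without %f, which AVR's printf lacks by default
snprintf(line, sizeof(line), "Temp: %d.%d C", t10 / 10, t10 % 10);
Serial.println(line);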
Commentary from systems experts elsewhere on the net:
The Evils of Arduino Strings (required reading!)
Why is malloc harmful in embedded systems?
Dr Dobbs Journal
Memory Fragmentation in servers (MSDN)
Memory Fragmentation, your worst nightmare (nice graphics)
Another answer of mine.

OpenCL clBuildProgram caches source, and does not recompile if #include'd source changes

I have implemented a project with OpenCL. I have a file which contains the kernel function, and the functions used by the kernel are included from a separate header file. But when I change the included file, sometimes the changes are applied and sometimes they are not, which leaves me confused about whether the application has a bug or not.
I checked other posts on Stack Overflow and saw that NVIDIA has serious problems with passing -I{include directory}, so I changed it and gave the header file paths explicitly; but still the OpenCL compiler is not able to find errors in the header file included from the kernel file.
I am using an NVIDIA GTX 980 and have installed CUDA 7.0 on my computer.
Has anyone had the same experience? How can I fix it?
So, assume I have a kernel like this:
#include "../../src/cl/test_kernel_include.cl"
void __kernel test_kernel(
__global int* result,
int n
)
{
int thread_idx = get_global_id(0);
result[thread_idx] = test_func();
}
where test_kernel_include.cl is as follows:
int test_func()
{
    return 1;
}
Then I run the code and I get an array in which all the members are equal to 1, as we expect. Now I change test_kernel_include.cl to:
int test_func()
{
    return 2;
}
but the result is still an array in which all the members are equal to 1; they should change to 2, but they do not.
Do this before platform initialization:
setenv("CUDA_CACHE_DISABLE", "1", 1);
It will disable caching mechanism for the build.
It also works for the OpenCL platform, even though it says CUDA.
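For instance, a minimal host-side sketch (error checking omitted; setenv is POSIX, so Windows would need _putenv_s or similar):
#include <cstdlib>
#include <CL/cl.h>

int main() {
    setenv("CUDA_CACHE_DISABLE", "1", 1); // must run before the driver loads and consults its cache

    cl_platform_id platform;
    clGetPlatformIDs(1, &platform, NULL);
    // ... create the context and call clBuildProgram as usual: the cache is now bypassed
    return 0;
}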
In order to improve kernel compilation times, NVIDIA implements a caching scheme, whereby a compiled kernel binary is stored to disk and loaded the next time the same kernel is compiled. A hash is computed over the kernel source code, which is then used as the index into the compiled kernel cache.
Unfortunately, these hashes do not include any header files that are included by the main kernel source. This means that when you change something in an included header file, the driver will essentially ignore the change and reload the previous kernel binary from disk (unless something changed in the main kernel source as well).
On Linux systems, the kernel cache can be found in ~/.nv/ComputeCache. If you delete this directory after making a change to one of your include files, then it should force the driver to actually recompile the OpenCL kernel.

clEnqueueNDRange blocking on Nvidia hardware? (Also Multi-GPU)

On Nvidia GPUs, when I call clEnqueueNDRange, the program waits for it to finish before continuing. More precisely, I'm calling its equivalent C++ binding, CommandQueue::enqueueNDRange, but this shouldn't make a difference. This only happens remotely on Nvidia hardware (3 Tesla M2090s); on our office workstations with AMD GPUs, the call is non-blocking and returns immediately. I don't have local Nvidia hardware to test on - we used to, and I remember similar behavior then too, but it's a bit hazy.
This makes spreading the work across multiple GPUs harder. I've tried starting a new thread for each call to enqueueNDRange using std::async/std::future from the new C++11 spec, but that doesn't seem to work either - monitoring the GPU usage in nvidia-smi, I can see that the memory usage on GPU 0 goes up, then it does some work, then the memory on GPU 0 goes down and the memory on GPU 1 goes up, that one does some work, and so on. My gcc version is 4.7.0.
Here's how I'm starting the kernels, where increment is the desired global work size divided by the number of devices, rounded up to the nearest multiple of the desired local work size:
std::vector<cl::CommandQueue> queues;
/* Population of queues happens somewhere */
cl::NDRange offset, increment, local;
std::vector<std::future<cl_int>> enqueueReturns;
int numDevices = queues.size();
/* Calculation of increment (local is taken from the function parameters) */

// Distribute the job among each of the devices in the context
for (int i = 0; i < numDevices; i++)
{
    // Update the offset for the current device
    offset = cl::NDRange(i*increment[0], i*increment[1], i*increment[2]);

    // Start a new thread for each call to enqueueNDRangeKernel
    enqueueReturns.push_back(std::async(
        std::launch::async,
        &cl::CommandQueue::enqueueNDRangeKernel,
        &queues[i],
        kernels[kernel],
        offset,
        increment,
        local,
        (const std::vector<cl::Event>*)NULL,
        (cl::Event*)NULL));
    // Without those last two casts, the program won't even compile
}

// Wait for all threads to join before returning
for (int i = 0; i < numDevices; i++)
{
    execError = enqueueReturns[i].get();
    if (execError != CL_SUCCESS)
        std::cerr << "Informative error omitted due to length" << std::endl;
}
The kernels definitely should be running on the call to std::async, since I can create a little dummy function, set a breakpoint on it in GDB and have it step into it the moment std::async is called. However, if I make a wrapper function for enqueueNDRangeKernel, run it there, and put in a print statement after the run, I can see that it takes some time between prints.
P.S. The Nvidia dev zone is down due to hackers and such, so I haven't been able to post the question there.
EDIT: Forgot to mention - the buffer that I'm passing to the kernel as an argument (the one I mention above that seems to get passed between the GPUs) is declared with CL_MEM_COPY_HOST_PTR. I had been using CL_MEM_READ_WRITE, with the same effect.
I emailed the Nvidia guys and actually got a pretty fair response. There's a sample in the Nvidia SDK that shows that for each device you need to create separate:
queues - so you can represent each device and enqueue orders to it
buffers - one buffer for each array you need to pass to the device; otherwise the devices will pass around a single buffer, waiting for it to become available and effectively serializing everything
kernels - I think this one's optional, but it makes specifying arguments a lot easier
Furthermore, you have to call EnqueueNDRangeKernel for each queue in separate threads. That's not in the SDK sample, but the Nvidia guy confirmed that the calls are blocking.
After doing all this, I achieved concurrency on multiple GPUs. However, there's still a bit of a problem. On to the next question...
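A rough sketch of that per-device setup (my own illustration, not the SDK sample; context and bufferSize are assumed to exist already):
std::vector<cl::CommandQueue> queues;
std::vector<cl::Buffer> buffers;
for (const cl::Device& dev : context.getInfo<CL_CONTEXT_DEVICES>()) {
    queues.emplace_back(context, dev);                            // one queue per device
    buffers.emplace_back(context, CL_MEM_READ_WRITE, bufferSize); // one buffer per device
}
// ...then call enqueueNDRangeKernel on each queue from its own thread.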
Yes, you're right. AFAIK the NVIDIA implementation has a synchronous clEnqueueNDRange. I noticed this when using my library (Brahma) as well. I don't know if there is a workaround or a way of preventing this, short of using a different implementation (and hence a different device).
