Periodically I program sloppily. Ok, I program sloppily all the time, but sometimes that catches up with me in the form of out of memory errors. I start exercising a little discipline in deleting objects with the rm() command and things get better. I see mixed messages online about whether I should explicitly call gc() after deleting large data objects. Some say that before R returns a memory error it will run gc() while others say that manually forcing gc is a good idea.
Should I run gc() after deleting large objects in order to ensure maximum memory availability?
"Probably." I do it too, and often even in a loop as in
cleanMem <- function(n=10) { for (i in 1:n) gc() }
Yet that does not, in my experience, restore memory to a pristine state.
So what I usually do is to keep the tasks at hand in script files and execute those using the 'r' frontend (on Unix, and from the 'littler' package). Rscript is an alternative on that other OS.
That workflow happens to agree with
workflow-for-statistical-analysis-and-report-writing
tricks-to-manage-the-available-memory-in-an-r-session
which we covered here before.
From the help page on gc:
A call of 'gc' causes a garbage
collection to take place. This will
also take place automatically without
user intervention, and the primary
purpose of calling 'gc' is for the
report on memory usage.
However, it can be useful to call 'gc'
after a large object has been removed,
as this may prompt R to return memory
to the operating system.
So it can be useful to do, but mostly you shouldn't have to. My personal opinion is that it is code of last resort - you shouldn't be littering your code with gc() statements as a matter of course, but if your machine keeps falling over, and you've tried everything else, then it might be helpful.
By everything else, I mean things like
Writing functions rather than raw scripts, so variables go out of scope.
Emptying your workspace if you go from one problem to another unrelated one.
Discarding data/variables that you aren't interested in. (I frequently receive spreadsheets with dozens of uninteresting columns.)
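The first item in that list (functions, so that variables go out of scope) can be sketched as follows. This is a Python analogy rather than R, and `Table`, `summarise` and the row count are made-up names for illustration only. The idea is the same: a large temporary that lives only inside a function becomes collectable as soon as the function returns, without any explicit rm().

```python
import gc
import weakref


class Table:
    """Hypothetical stand-in for a large data object."""

    def __init__(self, n_rows):
        self.rows = list(range(n_rows))


def summarise(n_rows):
    # 'table' is local: once the function returns, nothing refers to it.
    table = Table(n_rows)
    # A weak reference lets us observe the object without keeping it alive.
    probe = weakref.ref(table)
    return sum(table.rows), probe


total, probe = summarise(100_000)
gc.collect()  # usually unnecessary in CPython; makes the check below robust
table_is_gone = probe() is None
print(total, table_is_gone)
```

Only the small summary value survives the call; the large object does not need to be deleted by hand.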
Supposedly R uses only RAM. That's just not true on a Mac (and I suspect it's not true on Windows either). If it runs out of RAM, it will start using virtual memory. Sometimes, but not always, processes will 'recognize' that they need to run gc() and free up memory. When they do not, you can see it in Activity Monitor: all the RAM is occupied and disk access has jumped up. When I am doing large Cox regression runs, I find that I can avoid spilling over into virtual memory (with slow disk access) by preceding the calls with gc(), as in gc(); cph(...)
A bit late to the party, but:
Explicitly calling gc will free some memory "now". ...so if other processes need the memory, it might be a good idea. For example before calling system or similar. Or perhaps when you're "done" with the script and R will sit idle for a while until the next job arrives - again, so that other processes get more memory.
If you just want your script to run faster, it won't matter since R will call it later if it needs to. It might even be slower since the normal GC cycle might never have needed to call it.
...but if you want to measure time for instance, it is typically a good idea to do a GC before running your test. This is what system.time does by default.
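A minimal sketch of the "collect first, then time" idea, written in Python as an analogy (in R, system.time already does this for you). The helper name `timed` is hypothetical, not part of any library.

```python
import gc
import time


def timed(fn, *args):
    """Run fn(*args) after a forced collection and return (result, seconds)."""
    gc.collect()  # start from a clean slate so earlier garbage is not billed to fn
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = timed(sum, range(1000))
print(result, elapsed)
```

Collecting before the measurement makes it less likely that a collection triggered by earlier work lands inside the timed region.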
UPDATE: As @DWin points out, R (or C#, or Java etc.) doesn't always know when memory is low and the GC needs to run. So you could sometimes need to do GC as a work-around for deficiencies in the memory system.
No. If there is not enough memory available for an operation, R will run gc() automatically.
"Maybe." I don't really have a definitive answer. But the help file suggests that there are really only two reasons to call gc():
You want a report of memory usage.
After removing a large object, "it may prompt R to return memory to the operating system."
Since it can slow down a large simulation with repeated calls, I have tended to only do it after removing something large. In other words, I don't think that it makes sense to systematically call it all the time unless you have good reason to.
UPDATE
To better clarify, my question is not if I'm doing the code right or not, I've already understood after the profiling that I wasn't.
The question is: should SBCL ever sit at 100% CPU after a program has finished, regardless of how well or badly the code was written? And is this something that you have seen happen before, i.e. a known bug?
I'd give a reproducible example if I could, but this CPU hogging only happens sometimes (and I've never used multithreading constructs anywhere).
Sorry for not being more clear the first time around :)
-----
Bug?
I'm having occasional issues with Lisp using 100% CPU for long periods of time after running programs.
Update: Right now it was using 100% CPU for 40 minutes after the program had finished computation.
Environment: SBCL, Roswell, Emacs + SLIME
My question is if this is a known bug in Common Lisp that I'm not aware of and might be related to GC?
Context
It's not the first time this has happened "randomly". More computationally heavy programs that do a lot of memory allocation sometimes end up using 100% CPU for a long time (40 min in this case) after the program has finished.
The routine is single-threaded, thus there's no possibility of some task still running in the background.
I don't believe it's normal for SBCL to spend 40min after a program runs using 100% CPU. I'm afraid this might be related to some bug in GC?
I then profiled the program in SLIME. The program was super slow (~20 min execution) and did a lot of allocation. I then changed one line, and it now takes 2 s to run, just because I had always been formatting a debug string to an empty stream (thus generating a new string representation of a list with 100k integers at each call):
(https://github.com/AlbertoEAF/advent_of_code_2019/commit/b37797df772c12c2d409b1c3356cf5b690c8f928)
That is not my point, though. Even though this case is extremely ill-posed, the task I'm doing is very simple, so the particular program is irrelevant. My concern is the instability of the platform in scenarios with sustained heavy computation and allocation. Are there reports of any issues like this with SLIME/SBCL, or something else I'm not aware of?
Thank you!
The reason your change improves performance is that debug-stream is NIL.
In the old code you evaluate:
(format nil ...)
When you give nil as the stream to format, it prints to a string so you are doing the formatting work and allocating a big string you throw away.
In the new code you do:
(when nil ...)
Which costs approximately 0.
Note that nil does not mean do nothing when you pass it to format. In general if you want to do nothing you should do nothing instead of calling functions that do things.
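The same trap exists in other languages. Here is a rough Python analogy (not the original Lisp code): `Counted`, `debug_eager` and `debug_guarded` are made-up names, and the counter exists only to show how often the expensive formatting actually runs when there is no debug stream.

```python
class Counted:
    """Hypothetical wrapper that counts how often it is turned into text."""

    calls = 0

    def __init__(self, items):
        self.items = items

    def __repr__(self):
        Counted.calls += 1
        return repr(self.items)


def debug_eager(stream, state):
    # Like (format nil ...): the string is built even if nobody reads it.
    text = "state: %r" % state
    if stream is not None:
        stream.write(text)


def debug_guarded(stream, state):
    # Like (when nil ...): nothing is formatted unless there is a stream.
    if stream is not None:
        stream.write("state: %r" % state)


state = Counted(list(range(100_000)))

debug_eager(None, state)
eager_calls = Counted.calls

debug_guarded(None, state)
guarded_calls = Counted.calls - eager_calls

print(eager_calls, guarded_calls)
```

With no stream, the eager version still formats the whole 100k-element list once per call, while the guarded version does no formatting at all.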
I was wondering if it is possible to create a programming language without explicit memory allocation/deallocation (like C, C++ ...) AND without garbage collection (like Java, C#...) by doing a full analysis at the end of each scope?
The obvious problem is that this would take some time at the end of each scope, but I was wondering if it has become feasible with all the processing power and multiple cores in current CPUs. Do such languages exist already?
I also was wondering if a variant of C++ where smart pointers are the only pointers that can be used, would be exactly such a language (or am I missing some problems with that?).
Edit:
Well after some more research apparently it's this: http://en.wikipedia.org/wiki/Reference_counting
I was wondering why this isn't more popular. The disadvantages listed there don't seem very serious, and the overhead shouldn't be that large in my opinion. A (non-interpreted, properly written from the ground up) language with C-family syntax and reference counting seems like a good idea to me.
The biggest problem with reference counting is that it is not a complete solution and is not capable of collecting a cyclic structure. The overhead is incurred every time you set a reference; for many kinds of problems this adds up quickly and can be worse than just waiting for a GC later. (Modern GC is quite advanced and awesome - don't count it down like that!!!)
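The cycle problem is easy to see in CPython, which combines reference counting with a separate cycle collector. The sketch below assumes CPython specifically (other Python implementations manage memory differently), and `Node` is a made-up class. With the cycle collector switched off, two objects that point at each other stay alive even after every outside reference is gone.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


gc.disable()  # leave only reference counting active

a = Node()
b = Node()
a.other = b
b.other = a  # a and b now form a cycle

probe = weakref.ref(a)  # observe 'a' without keeping it alive
del a, b  # drop every outside reference

# Each object is still referenced by the other, so neither count reaches zero.
alive_after_del = probe() is not None

gc.collect()  # an explicit pass of the cycle collector finds the unreachable pair
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect)
```

Pure reference counting leaks the pair; it takes a tracing pass (or manual cycle breaking, or weak references) to reclaim it.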
What you are talking about is nothing special, and it shows up all the time. The C or C++ variant you are looking for is just plain regular C or C++.
For example write your program normally, but constrain yourself not to use any dynamic memory allocation (no new, delete, malloc, or free, or any of their friends, and make sure your libraries do the same), then you have that kind of system. You figure out in advance how much memory you need for everything you could do, and declare that memory statically (either function level static variables, or global variables). The compiler takes care of all the accounting the normal way, nothing special happens at the end of each scope, and no extra computation is necessary.
You can even configure your runtime environment to have a statically allocated stack space (this one isn't really under the compiler's control; it is more a matter of the linker and operating system environment). Just figure out how deep your function call chain goes and how much memory it uses (with a profiler or similar tool), and set it in your link options.
Without dynamic memory allocation (and thus no deallocation through either garbage collection or explicit management), you are limited to the memory you declared when you wrote the program. But that's ok, many programs don't need dynamic memory, and are already written that way. The real need for this shows up in embedded and real-time systems when you absolutely, positively need to know exactly how long an operation will take, how much memory (and other resources) it will use, and that the running time and the use of those resources can't ever change.
The great thing about C and C++ is that the language requires so little from the environment, and gives you the tools to do so much, that smart pointers, statically allocated memory, or even some special scheme that you dream up can be implemented. Requiring their use, and accepting the constraints you put on yourself, just becomes a policy decision. You can enforce that policy with code auditing (use scripts to scan the source or object files, and don't permit linking to the dynamic memory libraries).
The way our application is currently written involves creating a new connection and command object in every method that accesses our sqlite db. Considering we need it to run on a WM5 device, that is leading to hideous performance.
Our plan is to use just one connection object per-thread but it's also occurred to us to use one global command object per-thread too. The benefit of this is it reduces the overhead on the garbage collector created by instantiating objects all over the place.
I can't find any advice against doing this but wondered if anyone can answer definitively if this is a good or bad thing to do, and why?
While I'm not sure about reducing the number of command objects, reducing the number of connections is definitely a good plan. They're designed to be relatively expensive to set up (hey, they involve actually opening a disk file!) so keeping them around for a relatively long time is highly sensible. So do the first stage of your plan and retime to see if that makes things good enough, or if you need to do more work optimizing…
Note that it is quite possible that generating the command objects once per connection will be a saving too, since that will allow them to be compiled once and reused multiple times. Not that that matters until you're persisting the connection in the first place!
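A rough sketch of the "one connection per thread, reuse the statement" idea, using Python's sqlite3 module as a stand-in for the actual platform (which will have a different API). The `readings` table, `get_connection` and `INSERT_SQL` are made up for illustration. In Python's module the compiled statement is cached per connection and looked up by its SQL text, so keeping the connection alive is what makes the reuse possible.

```python
import sqlite3
import threading

_local = threading.local()  # holds one connection per thread

INSERT_SQL = "INSERT INTO readings (value) VALUES (?)"  # same text every call


def get_connection(path=":memory:"):
    """Return this thread's connection, opening it only on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(path)
        _local.conn = conn
    return conn


conn = get_connection()
conn.execute("CREATE TABLE readings (value INTEGER)")

for value in range(5):
    # Every call reuses the same connection; only the parameter changes.
    get_connection().execute(INSERT_SQL, (value,))
conn.commit()

row_count = get_connection().execute("SELECT COUNT(*) FROM readings").fetchone()[0]
same_object = get_connection() is conn
print(row_count, same_object)
```

As suggested above, do this first and measure again before optimizing the command objects any further.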
My problem is:
I have a Perl script which uses a lot of memory (expected behaviour because of caching). But I noticed that the more caching I do, the slower it gets, and the process spends most of its time in sleep mode.
I thought pre-allocating memory to the process might speed up the performance.
Does someone have any ideas here?
Update:
I think I am not being very clear here. I will put the question more clearly:
I am not looking for ways of pre-allocating inside the Perl script. I don't think that would help me much here. What I am interested in is a way to tell the OS to allocate X amount of memory for my Perl script so that it does not have to compete with other processes coming in later.
Assume that I can't get away from the memory usage. I am exploring ways of reducing that too, but don't expect much improvement there.
FYI, I am working on a Solaris 10 machine.
What I gathered from your posting and comments is this:
Your program gets slow when memory use rises
Your program increasingly spends time sleeping, not computing.
Most likely explanation: Sleeping means waiting for a resource to become available. In this case the resource most likely is memory. Use the vmstat 1 command to verify. Have a look at the sr column. If it goes beyond ~150 consistently, the system is desperate to free pages to satisfy demand. This is accompanied by high activity in the pi, po and fr columns.
If this is in fact the case, your best choices are:
Upgrade system memory to meet demand
Reduce memory usage to a level appropriate for the system at hand.
Preallocating memory will not help. In either case memory demand will exceed the available main memory at some point. The kernel will then have to decide which pages need to be in memory now and which pages may be cleared and reused for the more urgently needed pages. If the set of regularly needed pages (the working set) exceeds the size of main memory, the system is constantly moving pages to and from secondary storage (swap). The system is then said to be thrashing and spends little time doing useful work. There is nothing you can do about this except adding memory or using less of it.
From a comment:
The memory limitations are not very severe but the memory footprint easily grows to GBs and when we have competing processes for memory, it gets very slow. I want to reserve some memory from OS so that thrashing is minimal even when too many other processes come. Jagmal
Let's take a different tack then. The problem isn't really with your Perl script in particular. Instead, all the processes on the machine are consuming too much memory for the machine to handle as configured.
You can "reserve" memory, but that won't prevent thrashing. In fact, it could make the problem worse because the OS won't know if you are using the memory or just saving it for later.
I suspect you are suffering the tragedy of the commons. Am I right that many other users are on the machine in question? If so, this is more of a social problem than a technical problem. What you need is someone (probably the System Administrator) to step in and coordinate all the processes on the machine. They should find the most extravagant memory hogs and work with their programmers to reduce the cost on system resources. Further, they ought to arrange for processes to be scheduled so that resource allocation is efficient. Finally, they may need to get more or improved hardware to handle the expected system load.
Some questions you might ask yourself:
are my data structures really useful for the task at hand?
do I really have to cache that much?
can I throw away cached data after some time?
my @array;
$#array = 1_000_000; # pre-extend array to one million elements,
# http://perldoc.perl.org/perldata.html#Scalar-values
my %hash;
keys(%hash) = 8192; # pre-allocate hash buckets
# (same documentation section)
Not being familiar with your code, I'll venture some wild speculation here [grin] that these techniques aren't going to offer new great efficiencies to your script, but that the pre-allocation could help a little bit.
Good luck!
-- Douglas Hunter
I recently rediscovered an excellent Randal L. Schwartz article that includes preallocating an array. Assuming this is your problem, you can test preallocating with a variation on that code. But be sure to test the result.
The reason the script gets slower with more caching might be thrashing. Presumably the reason for caching in the first place is to increase performance. So a quick answer is: reduce caching.
Now there may be ways to modify your caching scheme so that it uses less main memory and avoids thrashing. For instance, you might find that caching to a file or database instead of to memory can boost performance. I've found that file system and database caching can be more efficient than application caching and can be shared among multiple instances.
Another idea might be to alter your algorithm to reduce memory usage in other areas. For instance, instead of pulling an entire file into memory, Perl programs tend to work better reading line by line.
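The line-by-line idea looks roughly like this. It is sketched in Python rather than Perl, and `count_matching_lines` and the sample log content are invented for the example. The point is that only one line is held in memory at a time instead of the whole file.

```python
import os
import tempfile


def count_matching_lines(path, needle):
    """Count lines containing needle without loading the whole file."""
    count = 0
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:  # the file object yields one line at a time
            if needle in line:
                count += 1
    return count


# Write a tiny made-up sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("ok\nerror: disk\nok\nerror: net\n")
    sample_path = tmp.name

matches = count_matching_lines(sample_path, "error")
os.remove(sample_path)
print(matches)
```

Memory use stays roughly constant regardless of file size, which is what matters when the working set is already too large.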
Finally, have you explored the Memoize module? It might not be immediately applicable, but it could be a source of ideas.
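For a feel of what a memoizing cache with a size cap does, here is a sketch using Python's functools.lru_cache as an analogy (it is not Perl's Memoize, and `expensive` plus the tiny maxsize are made up for the demonstration). Capping the cache trades some recomputation for a bounded memory footprint, which is the trade-off to consider when caching itself causes thrashing.

```python
from functools import lru_cache

calls = 0  # counts how often the real computation runs


@lru_cache(maxsize=2)  # keep at most two results; evict the least recently used
def expensive(key):
    global calls
    calls += 1
    return key * key


expensive(1)  # computed
expensive(2)  # computed
expensive(1)  # served from the cache
expensive(3)  # computed; evicts the entry for 2
expensive(2)  # computed again because it was evicted

info = expensive.cache_info()
print(calls, info)
```

In a real script the cap would be chosen so that the cache fits comfortably in main memory alongside everything else on the machine.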
I could not find a way to do this yet.
But, I found out that (See this for details)
Memory allocated to lexicals (i.e.
my() variables) cannot be reclaimed or
reused even if they go out of scope.
It is reserved in case the variables
come back into scope. Memory allocated
to global variables can be reused
(within your program) by using
undef()ing and/or delete().
So, I believe one possibility here could be to check whether I can reduce the total memory footprint of lexical variables at a given point in time.
It sounds like you are looking for limit or ulimit. But I suspect that will cause a script that goes over the limit to fail, which probably isn't what you want.
A better idea might be to share cached data between processes. Putting data in a database or in a file works well in my experience.
I hate to say it, but if your memory limitations are this severe, Perl is probably not the right language for this application. C would be a better choice, I'd think.
One thing you could do is to use Solaris zones (containers).
You could put your process in a zone and allocate it resources like RAM and CPUs.
Here are two links to some tutorials :
Solaris Containers How To Guide
Zone Resource Control in the Solaris 10 08/07 OS
While it's not pre-allocating as you asked for, you may also want to look at the large page size options, so that when Perl has to ask the OS for more memory for your program, it gets it in larger chunks.
See Solaris Internals: Multiple Page Size Support for more information on the difference this makes and how to do it.
Look at http://metacpan.org/pod/Devel::Size
You could also inline a C function to do the above.
As far as I know, you cannot allocate memory directly from Perl. You can get around this by writing an XS module, or using an inline C function like I mentioned.