I'm trying to run Valgrind on a mips32 machine in order to detect a memory leak. The total available memory is 32MB (with no swap). The problem is that Valgrind itself cannot allocate the memory it needs and always fails with an "out of memory" error.
root#babidi# valgrind --leak-check=yes grep -r "foo" /etc/config/
==9392== Memcheck, a memory error detector
==9392== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==9392== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==9392== Command: grep -r foo /etc/config/
==9392==
==9392==
==9392== Valgrind's memory management: out of memory:
==9392== initialiseSector(TC)'s request for 27597024 bytes failed.
==9392== 20516864 bytes have already been allocated.
==9392== Valgrind cannot continue. Sorry.
==9392==
==9392== There are several possible reasons for this.
==9392== - You have some kind of memory limit in place. Look at the
==9392== output of 'ulimit -a'. Is there a limit on the size of
==9392== virtual memory or address space?
==9392== - You have run out of swap space.
==9392== - Valgrind has a bug. If you think this is the case or you are
==9392== not sure, please let us know and we'll try to fix it.
==9392== Please note that programs can take substantially more memory than
==9392== normal when running under Valgrind tools, eg. up to twice or
==9392== more, depending on the tool. On a 64-bit machine, Valgrind
==9392== should be able to make use of up 32GB memory. On a 32-bit
==9392== machine, Valgrind should be able to use all the memory available
==9392== to a single process, up to 4GB if that's how you have your
==9392== kernel configured. Most 32-bit Linux setups allow a maximum of
==9392== 3GB per process.
==9392==
==9392== Whatever the reason, Valgrind cannot continue. Sorry.
What I'm wondering is if it is possible to limit the amount of memory that Valgrind allocates. I tried playing with --max-stacksize and --max-stackframe but the result is always the same.
As mentioned in the comments, 32MB is not much. It must cover the OS and other necessary processes. When you analyze a program with Valgrind/Memcheck, it requires more than twice as much memory as the program would need by itself. This is because Memcheck stores shadow values for every allocated bit, so that it can recognize uninitialized variables.
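To make the shortfall concrete, the figures from the error output in the question can simply be added up. This is only a back-of-the-envelope sketch: the byte counts are copied from the log, the variable names are made up for illustration, and 1 MB is assumed to mean 2**20 bytes.

```python
# Figures copied from the Valgrind error output in the question.
requested = 27_597_024          # bytes requested by initialiseSector(TC)
already_allocated = 20_516_864  # bytes Valgrind had allocated before that
total_ram = 32 * 1024 * 1024    # 32 MB of RAM, no swap (assumes 1 MB = 2**20 bytes)

needed = requested + already_allocated
shortfall = needed - total_ram

print(f"needed:    {needed} bytes ({needed / 2**20:.1f} MiB)")
print(f"available: {total_ram} bytes (before the OS takes its share)")
print(f"shortfall: {shortfall} bytes ({shortfall / 2**20:.1f} MiB)")
```

Even if the whole 32MB were free, that one request would leave Valgrind well short of what it asked for, and none of the stack-size options change that.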
I think that the best solution would be to compile your program for your desktop computer and run Memcheck there. If you have leaks, uninitialized variables, etc. in your program, you will have them on your desktop computer as well.
If you are curious how your program will behave on the MIPS machine, analyze it with other Valgrind tools, such as Massif (which measures heap usage over time) and Cachegrind (cache performance). Those are much more lightweight than Memcheck.
I get a new error when I try to compile an R Markdown file. The following message appears:
Error: C stack usage 7971408 is too close to the limit
Execution halted
I did some research and I found some people with the same error:
Error: C stack usage is too close to the limit
C stack usage 7970960 is too close to the limit
GenomicRanges: C stack usage ... is too close to the limit
R mapping (C stack usage 7971616 is too close to the limit)
C stack usage 7972356 is too close to the limit #335
But in those cases the problem seems to be caused by a specific function or something similar.
The steps I took to try to solve this:
Uninstalled R and RStudio, reinstalled the latest versions of both, rebooted my computer... nothing.
Tried to change ulimit -s. This point is interesting, because this is the ulimit -a output in the R terminal:
geomicrobio-mac:~ geomicrobio$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 10240
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1392
virtual memory (kbytes, -v) unlimited
When I try to change ulimit -s to unlimited or 65532 in the R terminal, it doesn't change.
The ulimit -a of my terminal (macOS Monterey v12.0.1) is:
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 65532
-c: core file size (blocks) 0
-v: address space (kbytes) unlimited
-l: locked-in-memory size (kbytes) unlimited
-u: processes 1392
-n: file descriptors 2560
This only happens with R Markdown. I can build Shiny apps, run scripts, etc., but I can't compile any R Markdown file, even one that contains only text.
This is the output when I run base::Cstack_info() in the console:
size current direction eval_depth
7969177 14032 1 2
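A quick sanity check on these numbers. As far as I know, R keeps a safety margin and sets its internal C stack limit to about 95% of the OS stack limit. That factor is an assumption taken from memory of the R sources, not something stated in this thread. Under that assumption, the 8192 KB limit shown in the R terminal reproduces the size reported above, which suggests R really runs with the 8 MB limit and not with 65532 KB. The function name below is made up for illustration.

```python
# Assumption: R uses 95% of the OS stack limit as its own C stack limit.
SAFETY_FACTOR = 0.95

def r_stack_limit(ulimit_kbytes: int) -> int:
    """Approximate R's internal C stack limit in bytes for a given `ulimit -s`."""
    return int(ulimit_kbytes * 1024 * SAFETY_FACTOR)

reported_size = 7969177   # `size` from base::Cstack_info() above
failing_usage = 7971408   # usage quoted in the error message

print(r_stack_limit(8192))            # limit implied by the R terminal's ulimit
print(r_stack_limit(65532))           # limit implied by the macOS terminal's ulimit
print(failing_usage > reported_size)  # is the reported usage over the limit?
```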
My version of R:
platform x86_64-apple-darwin17.0
arch x86_64
os darwin17.0
system x86_64, darwin17.0
status
major 4
minor 1.2
year 2021
month 11
day 01
svn rev 81115
language R
version.string R version 4.1.2 (2021-11-01)
nickname Bird Hippie
If you know how to solve this, I would really appreciate your help.
Thank you.
I just deleted the .Rprofile file.
I'm testing a simple MPI program on my desktop (Ubuntu 16.04 LTS / Intel® Core™ i3-6100U CPU @ 2.30GHz × 4 / gcc 4.8.5 / OpenMPI 3.0.0) and mpirun won't let me use all of the cores on my machine (4). When I run:
$ mpirun -n 4 ./test2
I get the following error:
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
./test2
Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
But if I run with:
$ mpirun -n 2 ./test2
everything works fine.
I've seen from other answers that I can check the number of processors with
cat /proc/cpuinfo | grep processor | wc -l
and this tells me that I have 4 processors. I'm not interested in oversubscribing, I'd just like to be able to use all my processors. Can anyone help?
Your processor has 4 hyperthreads but only 2 cores (see the specs here).
By default, Open MPI does not run more than one MPI task per core.
You can have Open MPI run up to one MPI task per hyperthread with the following option:
mpirun --use-hwthread-cpus ...
FWIW
The command you mentioned reports the number of hyperthreads.
A better way to figure out the topology of a machine is via the lstopo command from the hwloc package.
MPI tasks are not bound to cores or threads on OS X, so if you are running on a Mac, --oversubscribe -np 4 would lead to the same result.
To resolve your problem, you can pass the --use-hwthread-cpus command-line argument to mpirun, as Gilles Gouaillardet already pointed out. With this option, Open MPI treats each hardware thread (its name for the threads provided by hyperthreading) as an Open MPI processor. Without it, Open MPI treats each CPU core as a processor, which is the default behavior. With --use-hwthread-cpus, mpirun determines the total number of processors available to you, that is, all processors on all hosts listed in the Open MPI host file, so you do not need to specify the -n parameter. With this technique you will not oversubscribe, and if an Open MPI processor runs on a virtual machine, it will use the number of threads assigned to that virtual machine. If your processor has more than two threads per core, such as a Xeon Phi (Knights Landing, Knights Mill, etc.) with four threads per core, each of those threads counts as an Open MPI processor.
Use lscpu. The number of cores per socket * the number of sockets gives you the number of physical cores (the ones Open MPI uses by default), whereas cores per socket * sockets * threads per core gives you the number of logical cores (the number you get from cat /proc/cpuinfo | grep processor | wc -l).
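The arithmetic above can be sketched in a few lines. The sample text is a hypothetical lscpu excerpt for a 2-core, 4-thread CPU, not output captured from the asker's machine, and count_cores is an illustrative name. The field labels assume an English locale.

```python
# Hypothetical lscpu excerpt for a 2-core, 4-thread CPU (illustration only).
SAMPLE_LSCPU = """\
CPU(s):                4
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
"""

def count_cores(lscpu_text: str) -> tuple[int, int]:
    """Return (physical_cores, logical_cores) from lscpu-style text."""
    fields = {}
    for line in lscpu_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # keep only "label: value" lines
            fields[key.strip()] = value.strip()
    sockets = int(fields["Socket(s)"])
    cores_per_socket = int(fields["Core(s) per socket"])
    threads_per_core = int(fields["Thread(s) per core"])
    physical = sockets * cores_per_socket
    return physical, physical * threads_per_core

physical, logical = count_cores(SAMPLE_LSCPU)
print(f"physical cores: {physical}, logical cores: {logical}")
```

On a real machine the text could come from subprocess.run(["lscpu"], capture_output=True, text=True).stdout instead of the sample string.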
I have access to an MPI cluster. It is a plain, clean LAN cluster with no SLURM or anything else installed except OpenMP, mpicc, and mpirun. I have sudo rights. The accessible, configured MPI nodes are all listed in /etc/hosts. I can compile and run MPI programs, but how do I get information about the cluster's capabilities: total cores available, processor info, total memory, currently running tasks?
In general, I'm looking for an analog of sinfo and squeue that works in an MPI environment.
total cores available:
total memory:
You can use Portable Hardware Locality (hwloc) to see the hardware topology and get information about the total cores and total memory.
Additionally, you can get information about the CPU using lscpu or cat /proc/cpuinfo.
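Since there is no scheduler to ask, one option is to add up the per-node figures yourself. Below is a minimal sketch that parses the text of /proc/cpuinfo and /proc/meminfo and totals it across nodes. The node names and their contents are hypothetical, and how the text is fetched from each host (for example over ssh) is left out.

```python
def mem_total_kb(meminfo_text: str) -> int:
    """Extract MemTotal (in kB) from the contents of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal line not found")

def logical_cpus(cpuinfo_text: str) -> int:
    """Count 'processor' entries in the contents of /proc/cpuinfo."""
    return sum(1 for line in cpuinfo_text.splitlines()
               if line.startswith("processor"))

# Hypothetical data for two nodes; real text would be read from each host.
nodes = {
    "node01": ("processor\t: 0\nprocessor\t: 1\n",
               "MemTotal:  8000000 kB\n"),
    "node02": ("processor\t: 0\nprocessor\t: 1\nprocessor\t: 2\nprocessor\t: 3\n",
               "MemTotal: 16000000 kB\n"),
}

total_cpus = sum(logical_cpus(cpu) for cpu, _ in nodes.values())
total_mem_kb = sum(mem_total_kb(mem) for _, mem in nodes.values())
print(f"{len(nodes)} nodes, {total_cpus} logical CPUs, {total_mem_kb} kB RAM")
```

Note that this counts logical CPUs (hardware threads), the same number that grep processor /proc/cpuinfo | wc -l reports.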
currently running tasks:
You can use the monitoring tool nmon from IBM (it's free).
The -t option of nmon reports the top running processes (like the top command). You can use nmon in online (interactive) or offline (recording) mode.
The following example is from IBM developerWorks:
nmon -fT -s 30 -c 120
This takes one snapshot every 30 seconds until it has collected 120 snapshots. You can then examine the output.
If you run it without -f, you will see the results live.
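To pick values for -s and -c, it helps to work out how long a recording will run. A tiny sketch using the numbers from the example above (the variable names are illustrative):

```python
interval_s = 30   # value passed to -s: seconds between snapshots
snapshots = 120   # value passed to -c: number of snapshots to take

total_s = interval_s * snapshots
print(f"capture lasts about {total_s} s ({total_s / 60:.0f} min)")
```

So the example command records for roughly one hour before it stops.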
I would like to write Slurm batch scripts (sbatch) to run several MPI applications. To do that, I would like to be able to run something like this:
salloc --nodes=1 mpirun -n 6 hostname
But I get this message:
There are not enough slots available in the system to satisfy the 6 slots
that were requested by the application:
hostname
Either request fewer slots for your application, or make more slots available for use.
The node actually has 4 CPUs. I am therefore looking for an option that allows more than one task per CPU, but I cannot find the proper one. I know that MPI on its own can run several processes per CPU when physical resources are scarce, so I think the problem is on the Slurm side.
Do you have any suggestions/comments?
Use srun and supply the --overcommit option, for example:
test.job:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=6
#SBATCH --overcommit
srun hostname
Run sbatch test.job
From man srun:
Normally, srun will not allocate more than one process per CPU. By specifying --overcommit you are explicitly allowing more than one process per CPU.
Note that, depending on your cluster configuration, this may or may not also work with mpirun, but I'd stick with srun unless you have a good reason not to.
An important warning: most MPI implementations have terrible performance by default when running overcommitted. How to address that is a different, much more difficult question.
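If you generate job files from a script, the same idea can be sketched as below: add --overcommit only when the task count exceeds the CPUs on the node. The function make_job_script and its parameters are hypothetical helpers, not part of Slurm. The emitted directives are the ones from test.job above.

```python
def make_job_script(ntasks: int, cpus_on_node: int, command: str = "hostname") -> str:
    """Build a single-node sbatch script, overcommitting only when needed."""
    lines = [
        "#!/bin/bash",
        "#SBATCH --nodes=1",
        f"#SBATCH --ntasks={ntasks}",
    ]
    if ntasks > cpus_on_node:
        # More tasks than CPUs: explicitly allow several processes per CPU.
        lines.append("#SBATCH --overcommit")
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"

# The case from the question: 6 tasks on a node with 4 CPUs.
print(make_job_script(ntasks=6, cpus_on_node=4))
```

Write the returned text to a file and submit it with sbatch as before.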
What is the command to find the L2 cache size of the CPU on the Solaris operating system, running on SPARC and x86 processors?
I don't have access to a Solaris box to test this out, but you might be able to achieve this using prtpicl.
prtpicl -v -c cpu | grep l2-cache-size
For a more portable option, check out the lstopo command from the hwloc project.
On SPARC, just run fpversion (/product/SUNWspro/bin/fpversion); it prints the -xcache code-generation options, which show the L1 and L2 cache sizes. Then read http://docs.oracle.com/cd/E19205-01/819-5267/bkazt/index.html to understand them.