I am loading some large .csv files on an RStudio server, which requires about 73 GB. The server runs Ubuntu 18.04.5 LTS with 125 GB of RAM, so in theory I should be able to load more data into RAM and run computations that need more memory. However, RStudio Server refuses with the following error message:
"Error: cannot allocate vector of size 3.9 Gb".
Looking at the output of free -mh, there appears to be 42 GB of available memory, but RStudio does not use it.
~:$ free -mh
              total        used        free      shared  buff/cache   available
Mem:           125G         81G        1.2G         28M         43G         42G
Swap:          979M        4.0M        975M
Why is this happening and how can I utilise these 42GB for computations?
top output:
top - 11:05:58 up 21:19, 1 user, load average: 0.24, 0.22, 0.22
Tasks: 307 total, 1 running, 201 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.6 us, 0.2 sy, 0.1 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13170917+total, 1216824 free, 85389312 used, 45103040 buff/cache
KiB Swap: 1003516 total, 999420 free, 4096 used. 45033504 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4292 user1 30 10 39956 5052 4272 S 2.6 0.0 0:06.94 galaxy
1171 root 20 0 488892 44216 7540 S 1.3 0.0 12:24.39 cbdaemon
5998 user1 20 0 44428 3936 3168 R 1.0 0.0 0:00.07 top
6153 user1 20 0 275740 101112 29396 S 0.7 0.1 25:55.97 nxagent
1207 root 20 0 377116 163944 68132 S 0.3 0.1 5:37.13 evl_manager
1247 root 20 0 1262092 63308 14956 S 0.3 0.0 0:35.77 mcollectived
6623 user1 20 0 126640 5852 3212 S 0.3 0.0 0:02.48 sshd
7413 user1 20 0 80.623g 0.078t 53712 S 0.3 63.8 20:14.46 rsession
1 root 20 0 225756 8928 6320 S 0.0 0.0 0:23.05 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
...
Surprisingly, the problem was not with the OS or R/RStudio Server but with data.table, which I was using to read the .csv files. Apparently, it prevented the system from using the available memory. Once I switched to dplyr, I had no problem using all the memory the system has.
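For anyone checking the same thing, the "available" column of free comes from the MemAvailable field in /proc/meminfo. Below is a minimal sketch, assuming Linux, that reads that field so it can be compared with the size in the "cannot allocate vector" message. The function names are made up for illustration, and the sketch only reports headroom; it does not explain why a particular package fails to use it.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields (MemTotal, MemAvailable, ...) are reported in kB.
    """
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_gib(meminfo_text: str) -> float:
    """Return the MemAvailable figure converted from kB to GiB."""
    return parse_meminfo(meminfo_text)["MemAvailable"] / 1024**2


def current_available_gib(path: str = "/proc/meminfo") -> float:
    """Read the live value on a Linux machine."""
    with open(path) as f:
        return available_gib(f.read())
```

Calling current_available_gib() on the server right before the failing read shows how much the kernel thinks it can still hand out at that moment.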
I have compiled Quantum ESPRESSO (Program PWSCF v.6.7MaX) for GPU acceleration (hybrid MPI/OpenMP) with the following options:
module load compiler/intel/2020.1
module load hpc_sdk/20.9
./configure F90=pgf90 CC=pgcc MPIF90=mpif90 --with-cuda=yes --enable-cuda-env-check=no --with-cuda-runtime=11.0 --with-cuda-cc=70 --enable-openmp BLAS_LIBS='-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core'
make -j8 pw
Apparently, the compilation ends successfully. Then I run the program:
export OMP_NUM_THREADS=1
mpirun -n 2 /home/my_user/q-e-gpu-qe-gpu-6.7/bin/pw.x < silverslab32.in > silver4.out
The program starts running and prints the following information:
Parallel version (MPI & OpenMP), running on 8 processor cores
Number of MPI processes: 2
Threads/MPI process: 4
...
GPU acceleration is ACTIVE
...
Estimated max dynamical RAM per process > 13.87 GB
Estimated total dynamical RAM > 27.75 GB
But after 2 minutes of execution the job ends with an error:
0: ALLOCATE: 4345479360 bytes requested; status = 2(out of memory)
0: ALLOCATE: 4345482096 bytes requested; status = 2(out of memory)
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[47946,1],1]
Exit code: 127
--------------------------------------------------------------------------
This node has more than 180 GB of available RAM. I checked the memory use with the top command:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
89681 my_user 20 0 30.1g 3.6g 2.1g R 100.0 1.9 1:39.45 pw.x
89682 my_user 20 0 29.8g 3.2g 2.0g R 100.0 1.7 1:39.30 pw.x
I noticed that the process stops when RES memory reaches 4 GB. These are the characteristics of the node:
(base) [my_user@gpu001]$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41
node 0 size: 95313 MB
node 0 free: 41972 MB
node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55
node 1 size: 96746 MB
node 1 free: 70751 MB
node distances:
node 0 1
0: 10 21
1: 21 10
(base) [my_user@gpu001]$ free -lm
              total        used        free      shared  buff/cache   available
Mem:         192059        2561      112716         260       76781      188505
Low:         192059       79342      112716
High:             0           0           0
Swap:          8191           0        8191
The version of MPI is:
mpirun (Open MPI) 3.1.5
This node is a compute node in a cluster, but whether I submit the job with SLURM or run it directly on the node, the error is the same.
Note that I compile on the login node and run on this GPU node; the difference is that the login node has no GPU attached.
I would really appreciate it if you could help me figure out what could be going on.
Thank you in advance!
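To put the numbers from the log side by side: each failing ALLOCATE request is only about 4 GiB, far below both the estimated 13.87 GB per process and the memory that free reports as available. A small sketch of that arithmetic, treating the free -lm figures as MiB; the helper name is made up for illustration.

```python
GIB = 1024**3


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


failed_requests = [4345479360, 4345482096]  # from the two ALLOCATE lines
available_mib = 188505                      # "available" column of free -lm

for n in failed_requests:
    print(f"{n} bytes = {bytes_to_gib(n):.2f} GiB")
print(f"available = {available_mib / 1024:.1f} GiB")
```

So the allocation that fails is small compared with the host RAM that is free, which matches the observation that the processes stop at around 4 GB of RES.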
[root@0cd6bfb7e363 app]# s6-svc /etc/services.d/uwsgi/
s6-svc: fatal: unable to control /etc/services.d/uwsgi/: supervisor not listening
[root@0cd6bfb7e363 app]# s6-svc -r /etc/services.d/uwsgi/
s6-svc: fatal: unable to control /etc/services.d/uwsgi/: supervisor not listening
however
[root@0cd6bfb7e363 app]# ps aufx
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 204 4 pts/0 Ss 01:42 0:00 s6-svscan -t0 /var/run/s6/services
root 30 0.0 0.0 188 8 pts/0 S 01:42 0:00 foreground if /etc/s6/init/init-stage2-redirfd foreground if if s6-echo -n --
root 40 0.0 0.0 184 4 pts/0 S 01:42 0:00 \_ foreground s6-setsid -gq -- with-contenv backtick -D 0 -n S6_LOGGING printcontenv S6_LO
root 261 0.0 0.0 15540 3864 pts/0 S 01:42 0:00 \_ /bin/bash
root 516 0.0 0.0 55204 3932 pts/0 R+ 01:52 0:00 \_ ps aufx
root 31 0.0 0.0 204 4 pts/0 S 01:42 0:00 s6-supervise s6-fdholderd
root 241 0.0 0.0 204 4 pts/0 S 01:42 0:00 s6-supervise nginx
root 246 0.0 0.0 56840 7228 ? Ss 01:42 0:00 \_ nginx: master process /usr/sbin/nginx
nginx 267 0.0 0.0 57484 5048 ? S 01:42 0:00 \_ nginx: worker process
root 243 0.0 0.0 204 4 pts/0 S 01:42 0:00 s6-supervise uwsgi
root 245 0.0 0.0 15140 3012 ? Ss 01:42 0:00 \_ bash ./run
root 255 0.2 0.0 259944 82060 ? S 01:42 0:01 \_ uwsgi /etc/uwsgi.ini
root 339 0.0 0.0 259944 70452 ? S 01:43 0:00 \_ uwsgi /etc/uwsgi.ini
root 340 0.0 0.0 259944 70452 ? S 01:43 0:00 \_ uwsgi /etc/uwsgi.ini
root 341 0.1 0.0 297500 96408 ? S 01:43 0:01 \_ uwsgi /etc/uwsgi.ini
root 342 0.0 0.0 259944 70452 ? S 01:43 0:00 \_ uwsgi /etc/uwsgi.ini
What could be wrong?
This happens because s6-overlay copies the services to a different folder. Look at the first line of the ps output:
[root@0cd6bfb7e363 app]# ps aufx
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 204 4 pts/0 Ss 01:42 0:00 s6-svscan -t0 /var/run/s6/services
The s6 supervisor is watching the services under /var/run/s6/services, so what you should run is:
$ s6-svc -u /var/run/s6/services/uwsgi
It is also explained in the README file:
Stage 2.iii) Copy user services (/etc/services.d) to the folder where s6 is running its supervision and signal it so that it can properly start supervising them.
and
You may want to use a Read-Only Root Filesystem, since that ensures s6-overlay copies files into the /var/run/s6 directory rather than use symlinks.
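The lookup described above can be sketched in a few lines: take the command line of s6-svscan (PID 1 in the ps output), read the scan directory from its last argument, and append the service name to get the path that s6-svc expects. This is only an illustration; the function names are made up, and the option handling is deliberately simplified.

```python
import shlex
from pathlib import PurePosixPath


def scan_dir_from_cmdline(cmdline: str) -> str:
    """Return the scan directory from an s6-svscan command line.

    Simplification: the last argument that does not start with "-" is
    taken to be the scan directory.
    """
    args = shlex.split(cmdline)
    if not args or PurePosixPath(args[0]).name != "s6-svscan":
        raise ValueError("not an s6-svscan command line")
    positional = [a for a in args[1:] if not a.startswith("-")]
    if not positional:
        raise ValueError("no scan directory on the command line")
    return positional[-1]


def service_dir(cmdline: str, service: str) -> str:
    """Build the supervised service directory to pass to s6-svc."""
    return str(PurePosixPath(scan_dir_from_cmdline(cmdline)) / service)


print(service_dir("s6-svscan -t0 /var/run/s6/services", "uwsgi"))
```

With the command line from the ps output this gives /var/run/s6/services/uwsgi, the same path used in the s6-svc -u command.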
I have a site running WordPress + Nginx + PHP5-FPM + WP Super Cache. The server is a Digital Ocean droplet with 4 GB of memory and 2 cores. At times (like yesterday), CPU usage is around 90% and the site becomes very slow. Most of the time it is between 30% and 50%.
Other articles I've read about "high" Nginx and PHP5-FPM usage mention rates of 40%+ for a single process. I'm not experiencing anything that unusual, but I wonder if my statistics are reasonable, given the machine and my Nginx configuration.
I've pasted nginx.conf here: http://pastebin.com/fD9HjqpB
General WordPress configuration file (for Nginx): http://pastebin.com/CVkHZtvU
And the site-specific config here: http://pastebin.com/KsVECwEa
Cpu(s): 47.8%us, 37.3%sy, 0.0%ni, 14.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4049968k total, 1363176k used, 2686792k free, 35376k buffers
Swap: 0k total, 0k used, 0k free, 549984k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
667 mysql 20 0 1314m 85m 7908 S 89 2.2 17:10.21 mysqld
1679 www-data 20 0 229m 54m 3836 S 15 1.4 0:04.23 php5-fpm
1688 www-data 20 0 234m 59m 3696 R 12 1.5 0:01.21 php5-fpm
1682 www-data 20 0 237m 61m 3780 S 10 1.6 0:02.93 php5-fpm
1681 www-data 20 0 231m 56m 3848 S 9 1.4 0:03.26 php5-fpm
1684 www-data 20 0 237m 62m 3816 S 9 1.6 0:02.15 php5-fpm
1680 www-data 20 0 240m 65m 5204 S 7 1.7 0:03.47 php5-fpm
1686 www-data 20 0 237m 62m 3836 S 6 1.6 0:01.33 php5-fpm
1687 www-data 20 0 237m 62m 3820 S 6 1.6 0:00.97 php5-fpm
1691 www-data 20 0 229m 54m 3828 S 6 1.4 0:00.52 php5-fpm
721 www-data 20 0 130m 57m 1104 S 0 1.5 0:00.48 nginx
Is there anything I could optimize further to lower CPU usage and avoid the 90%+ periods? Obviously we could increase the size of the droplet, but I'm hoping to avoid scaling the hardware if there is still room to optimize.
Any insight would be greatly appreciated!
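One way to read a top snapshot like the one above is to total %CPU per command, which makes it easier to compare mysqld against the PHP workers as a group. A rough sketch, assuming the standard twelve-column top layout shown in the snapshot; the function name is made up for illustration.

```python
from collections import defaultdict


def cpu_by_command(top_rows: str) -> dict:
    """Sum the %CPU column per COMMAND for pasted top process rows.

    Expected columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    Rows that do not start with a numeric PID (headers, summaries) are skipped.
    """
    totals = defaultdict(float)
    for line in top_rows.splitlines():
        cols = line.split()
        if len(cols) < 12 or not cols[0].isdigit():
            continue
        totals[cols[11]] += float(cols[8])
    return dict(totals)
```

Feeding it the process rows from the snapshot gives one total per command, so a single busy mysqld and many moderately busy php5-fpm workers can be compared directly.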
32537 apache 16 0 87424 15m 7324 S 2.3 0.3 0:00.52 httpd
3302 mysql 15 0 156m 41m 4756 S 1.3 0.7 10:50.91 mysqld
489 apache 16 0 87016 14m 6692 S 0.7 0.2 0:00.27 httpd
990 apache 15 0 0 0 0 Z 0.7 0.0 0:00.12 httpd <defunct>
665 apache 15 0 86992 13m 5644 S 0.3 0.2 0:00.20 httpd
32218 apache 15 0 87356 14m 6344 S 0.3 0.2 0:00.53 httpd
1 root 15 0 2160 640 556 S 0.0 0.0 0:01.18 init
From top, there is an occasional httpd <defunct> showing up. What does it do?
I found that the web server sometimes does not respond to FPDF requests (printing a PDF at the user's request). Is it related?
UPDATE, with load information:
top - 11:55:59 up 17:30, 6 users, load average: 0.53, 0.47, 0.80
Tasks: 322 total, 1 running, 320 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.7%us, 0.2%sy, 0.0%ni, 95.1%id, 3.9%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 6219412k total, 5944068k used, 275344k free, 21024k buffers
Swap: 5140792k total, 96k used, 5140696k free, 5270708k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1951 apache 16 0 0 0 0 Z 0.9 0.0 0:00.33 httpd <defunct>
2267 apache 15 0 86992 13m 5876 S 0.9 0.2 0:00.22 httpd
3302 mysql 15 0 156m 41m 4756 S 0.9 0.7 11:43.72 mysqld
2220 apache 15 0 87204 14m 6496 S 0.6 0.2 0:00.28 httpd
2340 apache 15 0 87828 13m 5588 S 0.6 0.2 0:00.22 httpd
2341 apache 17 0 88236 14m 5564 S 0.6 0.2 0:00.15 httpd
842 apache 16 0 87432 15m 7180 S 0.3 0.2 0:00.81 httpd
2225 apache 18 0 88236 14m 5560 S 0.3 0.2 0:00.17 httpd
2401 apache 15 0 86916 12m 5344 S 0.3 0.2 0:00.11 httpd
1 root 24 0 2160 640 556 S 0.0 0.0 0:01.18 init
A defunct process is a process that has exited but whose parent has not yet waited on it to read its exit status, leaving an entry in the process table. It is also known as a zombie process. See the Wikipedia article for more information.
A defunct process is typically one which has finished but the OS keeps it around until the parent waits for it to "collect" its status. You only normally see lots of this when you've written your own "forky" code and have bugs.
If you use
ps -Hwfe
you will see the process hierarchy, and therefore what the parent is. It is odd that it's an httpd process, since httpd is normally pretty good at collecting its children. Unless your system is flat out, which is why you're using top in the first place...
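If you would rather pull out just the zombies and their parents than read the whole hierarchy, something like the following works on the output of ps -eo pid,ppid,stat,comm. It is a sketch only; the function names are made up for illustration.

```python
import subprocess


def find_zombies(ps_output: str) -> list:
    """Return (pid, ppid, command) for every row whose STAT starts with 'Z'.

    Expects the output of `ps -eo pid,ppid,stat,comm`, header included.
    """
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, ppid, stat, comm = parts
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), comm.strip()))
    return zombies


def list_zombies() -> list:
    """Run ps on the local machine and report the zombies it shows."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_zombies(out)
```

The ppid in each result is the process that has not yet waited on its child, which is where to look next.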
When a process dies on Unix, it sends an exit code to its parent. A defunct process, or "zombie", is one whose parent hasn't yet looked at the zombie's exit code. Once the parent gets the exit code (using the wait system call), the zombie will disappear.
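The lifecycle is easy to reproduce. The sketch below, assuming Linux (it reads /proc) and using made-up function names, forks a child that exits immediately, shows that the child sits in state Z until the parent calls wait, and then shows that the entry is gone.

```python
import os
import signal
import time
from typing import Tuple


def child_state(pid: int) -> str:
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field comes right after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie() -> Tuple[str, int, bool]:
    # Make sure exited children are not reaped automatically.
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    pid = os.fork()
    if pid == 0:
        os._exit(7)  # child: exit immediately with status 7

    # Parent: do not wait yet; poll until the child shows up as a zombie.
    deadline = time.time() + 5
    state = child_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = child_state(pid)

    _, status = os.waitpid(pid, 0)            # collect the exit status
    exit_code = os.WEXITSTATUS(status)
    gone = not os.path.exists(f"/proc/{pid}")  # the process entry is removed
    return state, exit_code, gone


if __name__ == "__main__":
    print(demo_zombie())
```

A parent that never reaches the waitpid call, for example because it is stuck or busy, is what leaves a <defunct> entry visible in top.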
I am submitting a job using qsub that runs parallelized R. My intention is to have the R program run on 4 cores rather than 8. Here are some of the settings in my PBS file:
#PBS -l nodes=1:ppn=4
....
time R --no-save < program1.R > program1.log
I am issuing the command ta job_id and I can see that 4 cores are listed. However, the job occupies a large amount of memory (31944900k used vs 32949628k total). If I were to use 8 cores, the job would hang because of the memory limit.
top - 21:03:53 up 77 days, 11:54, 0 users, load average: 3.99, 3.75, 3.37
Tasks: 207 total, 5 running, 202 sleeping, 0 stopped, 0 zombie
Cpu(s): 30.4%us, 1.6%sy, 0.0%ni, 66.8%id, 0.0%wa, 0.0%hi, 1.2%si, 0.0%st
Mem: 32949628k total, 31944900k used, 1004728k free, 269812k buffers
Swap: 2097136k total, 8360k used, 2088776k free, 6030856k cached
Here is a snapshot from the command ta job_id:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1794 x 25 0 6247m 6.0g 1780 R 99.2 19.1 8:14.37 R
1795 x 25 0 6332m 6.1g 1780 R 99.2 19.4 8:14.37 R
1796 x 25 0 6242m 6.0g 1784 R 99.2 19.1 8:14.37 R
1797 x 25 0 6322m 6.1g 1780 R 99.2 19.4 8:14.33 R
1714 x 18 0 65932 1504 1248 S 0.0 0.0 0:00.00 bash
1761 x 18 0 63840 1244 1052 S 0.0 0.0 0:00.00 20016.hpc
1783 x 18 0 133m 7096 1128 S 0.0 0.0 0:00.00 python
1786 x 18 0 137m 46m 2688 S 0.0 0.1 0:02.06 R
How can I prevent other users from using the other 4 cores? I would like to somehow make it appear that my job is using 8 cores, with 4 of them idle.
Could anyone kindly help me out with this? Can this be solved using PBS?
Many Thanks
"How can I prevent other users from using the other 4 cores? I would like to somehow make it appear that my job is using 8 cores, with 4 of them idle."
Maybe a simple way around it is to submit a 'sleep' job on the other 4? Seems hackish though! (And a warning: my PBS is rusty!)
Why not do the following:
ask PBS for ppn=4 and, in addition, ask for all the memory on the node, i.e.
#PBS -l nodes=1:ppn=4 -l mem=31944900k
This might not be possible on your setup.
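The numbers in the question show why the memory request matters: each R worker holds about 6 GiB resident, so the node's 32949628k cannot hold eight of them. A rough sketch of that estimate; the function name and the optional reserve parameter are made up for illustration, and the per-worker figure is read off the RES column.

```python
def max_workers(total_kib: int, per_worker_gib: float, reserve_gib: float = 0.0) -> int:
    """Estimate how many workers of a given resident size fit in memory.

    total_kib      -- total memory in KiB, as reported by top
    per_worker_gib -- resident size of one worker in GiB
    reserve_gib    -- memory to leave for the OS and caches
    """
    usable_gib = total_kib / 1024**2 - reserve_gib
    return max(0, int(usable_gib // per_worker_gib))


total_kib = 32949628   # "Mem: ... total" line from top
per_worker_gib = 6.1   # largest RES value among the four R processes

print(max_workers(total_kib, per_worker_gib))
```

Under these assumptions fewer than eight workers fit, so reserving the whole node's memory for a 4-core job is consistent with what the job actually needs.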
I am not sure how R is parallelized, but if it uses OpenMP you could definitely ask for 8 cores but set OMP_NUM_THREADS to 4.
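A sketch of that idea, assuming the R code really is OpenMP-threaded (the four separate R processes in the top snapshot may mean it is not, in which case this variable has no effect). The helper names are made up for illustration; the R command line and file names are the ones from the question.

```python
import os
import subprocess


def openmp_env(threads: int, base=None) -> dict:
    """Copy an environment and cap OpenMP at the given number of threads."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(threads)
    return env


def run_r(script: str = "program1.R", log: str = "program1.log", threads: int = 4) -> int:
    """Run `R --no-save < script > log` with OMP_NUM_THREADS set."""
    with open(script) as src, open(log, "w") as out:
        result = subprocess.run(
            ["R", "--no-save"], stdin=src, stdout=out, env=openmp_env(threads)
        )
    return result.returncode
```

In a PBS script the same effect is normally obtained by exporting OMP_NUM_THREADS=4 before the R line while requesting ppn=8.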