Not enough space when running Hudson jobs - unix

My Hudson jobs are crashing on each run with this error:
Caused by: java.io.IOException: error=12, Not enough space
at java.lang.UNIXProcess.forkAndExec(Native Method)
I found documention on StackOverflow and on the Jenkins website regarding this error, which indicate a problem of swap space (https://wiki.jenkins-ci.org/display/JENKINS/IOException+Not+enough+space).
However, maybe my problem is different or not, but if I launch the process manually it works fine.
A weird thing is I see different resuls from top of from prstat:
Specs:
Hudson processes are running in their own Unix user
OS: SunOS dc5c00-d12 5.10 Generic_147440-19 sun4v sparc sun4v
Memory:
from top:
32G phys mem, 6255M free mem, 16G total swap, 16G free swap
from prstat
NPROC USERNAME SWAP RSS MEMORY TIME CPU
50 user1 12G 12G 39% 89:02:31 0.3%
36 user2 11G 6779M 21% 155:17:41 0.0%
26 user3 10G 8509M 26% 4787:37:4 8.0%
6 hudson 572M 556M 1.7% 0:08:25 0.0%
57 root 280M 285M 0.9% 138:46:05 0.0%
Can anywone confirm if I have a swap issue? top shows 16GB free...
EDIT:
results from swap -s (after problem being remporarly resolved)
total: 19940168k bytes allocated + 12578048k reserved = 32518216k used, 4118208k available
.

It is certainly a swap issue.
top is reporting as free swap blocks that do not contain paginated data. However, even while unused, some of these blocks can be reserved (i.e untouched still allocated virtual memory). When you have no more blocks to back memory reservations, you got this "Not enough space" exception.
swap -s shows your applications are reserving more that 12 GB while your swap area is just 16 GB. I would double the size of your swap to prevent virtual memory shortage in your case.

Related

Mysterious mariadb 10.4.1 ram ussage

i have upgraded from mariadb 10.1.36 to 10.4.8 and i can see mysterious increasing ram ussage on that new version. I also edited innodb_buffer_pool_size ant seems there is no effect if its set to 15M or 4G, ram is just slowly increasing. After while it eat whole ram and oom killer kills mariadb and this is repeating.
My server has 8GB RAM and its increasing like 60-150MB per day. Its not terrible but i have around 150 database servers so its huge problem.
I can temporary fix problem by restarting mariadb and its start again.
Info about database server:
databases: 200+
tables: 28200(141 per database)
average active connections: 100-200
size of stored data: 100-350GB
cpu: 4
ram: 8GB
there is my config:
server-id=101
datadir=/opt/mysql/
socket=/var/lib/mysql/mysql.sock
tmpdir=/tmp/
gtid-ignore-duplicates=True
log_bin=mysql-bin
expire_logs_days=4
wait_timeout=360
thread_cache_size=16
sql_mode="ALLOW_INVALID_DATES"
long_query_time=0.8
slow_query_log=1
slow_query_log_file=/opt/log/slow.log
log_output=TABLE
userstat = 1
user=mysql
symbolic-links=0
binlog_format=STATEMENT
default_storage_engine=InnoDB
slave_skip_errors=1062,1396,1690innodb_autoinc_lock_mode=2
innodb_buffer_pool_size=4G
innodb_buffer_pool_instances=5
innodb_log_file_size=1G
innodb_log_buffer_size=196M
innodb_flush_log_at_trx_commit=1
innodb_thread_concurrency=24
innodb_file_per_table
innodb_write_io_threads=24
innodb_read_io_threads=24
innodb_adaptive_flushing=1
innodb_purge_threads=5
innodb_adaptive_hash_index=64
innodb_flush_neighbors=0
innodb_flush_method=O_DIRECT
innodb_io_capacity=10000
innodb_io_capacity_max=16000
innodb_lru_scan_depth=1024
innodb_sort_buffer_size=32M
innodb_ft_cache_size=70M
innodb_ft_total_cache_size=1G
innodb_lock_wait_timeout=300
slave_parallel_threads=5
slave_parallel_mode=optimistic
slave_parallel_max_queued=10000000
log_slave_updates=on
performance_schema=on
skip-name-resolve
max_allowed_packet = 512M
query_cache_type=0
query_cache_size = 0
query_cache_limit = 1M
query_cache_min_res_unit=1K
max_connections = 1500
table_open_cache=64K
innodb_open_files=64K
table_definition_cache=64K
open_files_limit=1020000
collation-server = utf8_general_ci
character-set-server = utf8
log-error=/opt/log/error.log
log-error=/opt/log/error.log
pid-file=/var/run/mysqld/mysqld.pid
malloc-lib=/usr/lib64/libjemalloc.so.1
I solved it! The problem is memory allocation library.
If you do this SQL query:
SHOW VARIABLES LIKE 'version_malloc_library';
You must to get value "jemalloc" library. If you get only "system", you may have problems.
To change that, you need edit any .conf file in this directory:
/etc/systemd/system/mariadb.service.d/
There, add this line:
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"
(this library file may be in other folder)
Then you must to restart mysqld
service mysqld stop && systemctl daemon-reload && service mysqld start
You got carried away in increasing values in my.cnf.
Many of the caches grow until hitting their limit, hence the memory growth you experienced.
What is the value from SHOW GLOBAL STATUS LIKE 'Max_used_connections';? Having a large max_connections accentuates several of the other values; lower it.
But perhaps the really bad one(s) involve table caches -- which have units of tables, not bytes. Crank these down a lot:
table_open_cache=64K
innodb_open_files=64K
table_definition_cache=64K
I have exactly the same problem. Is it due to a bad configuration? Or is it a bug of the new version?
mariadb 10.1 was updated to 10.3 just when I upgraded Debian 9 to Debian 10. I tried solve the problem with mariadb 10.4 but nothing changed.
I want to downgrade version but I think it's neccesary dump all database and restore it, and that means being hours without service.
I don't think Debian 10 has to do with the issue
Please read my previous comments about alternative memory allocators...
When jemalloc is used:
When default memory allocator used:
Try with tcmalloc
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4.5.3.1"

502 Gitlab is taking too much time to respond

After taking gitlab backup everyday gitlab is throwing 502 error.
I saw nginx logs but did not find that much information.
After gitlab-ctl restart it starts working again.
System Configurations:
OS : Ubuntu 16.04 LTS
4 GB Ram
200 GB Disk Space
can anyone give permanent solution for it.
There is a high possibility that it run out of shared memory. As each time after the backup you got the 502 error.
To check it with gitlab-ctl tail tail detail
It will show something like:
2019-04-12_12:37:17.27154 FATAL: could not map anonymous shared memory: Cannot allocate memory
2019-04-12_12:37:17.27157 HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 4345470976 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
2019-04-12_12:37:17.27171 LOG: database system is shut down
Then check it with free -m, which shows there is no available shared memory.
total used free shared buffers cached
Mem: 16081 13715 2365 0 104 753
-/+ buffers/cache: 12857 3223
Then you need to check if there is some process take too many shared memory, or too many zomibe process, then kill it with command like ps -aef | grep ffmpeg | awk '{print $2}' | xargs kill 9
Check it with free -h, there is about 112M shared memory now.
total used free shared buffers cached
Mem: 15G 4.4G 11G 112M 46M 416M
-/+ buffers/cache: 3.9G 11G
Swap: 0B 0B 0B
At last,restart you gitlab with gitlab-ctl restart, after sometime the gitlab booted, the 502 gone.
After long search i got something about it. After taking backup my gitlab-workhorse is getting ideal and gitlab.socket is refusing the connection. As temporary solution i have installed a new cron job for restarting gitlab service after the complpetion of gitlab backup cronjob.
If the gitlab is installed in Virtual-Box - Ubuntu server either 18.04 or 20.04,
please increase the RAM to 4gb and the provide atleast 3 processors.

Zimbra not releasing space

We are using Zimbra 8.0 collaboration server open source edition. (Release 8.0.0_GA_5434.RHEL6_64_20120907144639 CentOS6_64 FOSS edition ) on centos6.
While we deleting an Account or Emptying Folder , the free Space in HDD is not increasing.
For Example, If we deleted an account having 10GB size, only 1GB is getting added to Total free space of HDD.
In detail, I Emptied an Account having size 10.1GB.
The Space occupied by the Store Folder before deleting is 10.1 GB
/opt/zimbra/store/0/443 - 10.1 GB
After Emptying inbox folder the size of the store folder reduced to 100 MB.
/opt/zimbra/store/0/443 - 100 MB
(Emptied using command zmmailbox -z -m xxx#xxx.com emptyFolder /Inbox)
But NO change in "df" output. The Total free space remains the same.
Accordingly, Space is not releasing while deleting an account.
Anyone Please help to find a solution in this issue. What need to be done to get free space added to HDD. Help needed.
I don't know what your dumpster settings are, though they could be determined by a command like this:
zmprov ga xxx#xxx.com |grep Dumpster
zimbraDumpsterEnabled: TRUE
zimbraDumpsterPurgeEnabled: TRUE
zimbraDumpsterUserVisibleAge: 10d
zimbraMailDumpsterLifetime: 10d
I suspect that the messages are sticking around in the dumpster though. You can clean them out with a command like this:
zmmailbox -z -m xxx#xxx.com -A emptyDumpster

Openstack Instance does not use the entire hard disk

I created new vm instance using "Ubuntu Server 10.04 LTS (Lucid Lynx) - 32 bits" image and m1.small falvour which has 20 GB Disk (OpenStack Icehouse). When i logging to the vm and run df -h , I found that the VM does not use the entire assigned HD. The command results are shown as the following:
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 1.4G 595M 721M 46% /
none 1005M 144K 1005M 1% /dev
none 1007M 0 1007M 0% /dev/shm
none 1007M 36K 1007M 1% /var/run
none 1007M 0 1007M 0% /var/lock
none 1007M 0 1007M 0% /lib/init/rw
The "fdisk -l" shows the DH size is 20 GB:
Disk /dev/vda: 21.5 GB, 21474836480 bytes
4 heads, 32 sectors/track, 327680 cylinders
Units = cylinders of 128 * 512 = 65536 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000cb9da
Device Boot Start End Blocks Id System
/dev/vda1 * 17 32768 2096128 83 Linux
I need the vm to take the full space assigned to it. Any idea how could I fix it? I want the solution to be applied on each vm I create, so I do not want to manually update the VM after instantiation. I also must use 10.04 image ( can not upgrdate to 14.04)
The problem here is the image. I grabbed that one and ran it up, it's pretty simple to run a
sudo resize2fs /dev/vda1
which will resize the filesystem to the size of the partition, which seems to be 2GB. Beyond that, you have to increase the partition size. For that I think you're probably best off using virt-resize, there are some good howto's out there e.g. askubuntu, in essence:
SSH into your openstack controller node
source keystonerc_admin (or whatever yours may be called)
nova list --all-tenants | grep <instance_name> or just grab the server guid from horizon
nova show <server_guid> and note which nova host your machine is running on. Also note the instance name (e.g. instance-00000adb)
SSH into that nova node
virsh dumpxml instance-00000adb and look for the image file. On mine, this is /var/lib/nova/instances/<server_guid>/disk but that may not always be the case?
yum install libguestfs-tools
truncate -r /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
truncate -s +2G /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
virt-resize --expand /dev/sda1 /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
mv disk disk.old ; mv disk.new disk
NB - mine didn't quite work when I booted that up again, not got time to investigate yet but it can't be far off that, and hopefully this helps.
Once you've managed to boot that up again, then you can shut it down and create a snapshot from horizon. You can then use that snapshot just like any other image, and launch all subsequent VMs directly from there.
HTH.

GridGain Out of Memory Exception: Unable to create new native thread

I'm trying to create more then 2 instances of Grid Gain (Just by running the shell script) in Red Hat Release 6.5 (Santiago), but i get the following error when i try to run the shell script a 3rd time:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
at java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1604)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start0(GridGainEx.java:1507)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start(GridGainEx.java:1289)
at org.gridgain.grid.kernal.GridGainEx.start0(GridGainEx.java:832)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:759)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:677)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:524)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:494)
at org.gridgain.grid.GridGain.start(GridGain.java:314)
at org.gridgain.grid.startup.cmdline.GridCommandLineStartup.main(GridCommandLineStartup.java:293)
I have set ulimit -n 4096 but still no joy
The box has 64GB of memory - ample amount to run more then 2 instances of GridGain
Can anyone help with this error? are there any configuration changes i can make in Red Hat?
Thanks
Most likely you are running out of allowed number of user processes. We have encountered the same issue on our CentOs servers and setting ulimit -u 10240 helped.

Resources