502 Gitlab is taking too much time to respond - nginx

After taking a GitLab backup every day, GitLab throws a 502 error.
I checked the nginx logs but did not find much useful information.
After gitlab-ctl restart it starts working again.
System Configurations:
OS : Ubuntu 16.04 LTS
4 GB RAM
200 GB Disk Space
Can anyone give a permanent solution for this?

There is a high possibility that it ran out of shared memory, since you get the 502 error each time after the backup.
Check it with gitlab-ctl tail for details.
It will show something like:
2019-04-12_12:37:17.27154 FATAL: could not map anonymous shared memory: Cannot allocate memory
2019-04-12_12:37:17.27157 HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 4345470976 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
2019-04-12_12:37:17.27171 LOG: database system is shut down
Then check it with free -m, which shows there is no available shared memory.
total used free shared buffers cached
Mem: 16081 13715 2365 0 104 753
-/+ buffers/cache: 12857 3223
Then you need to check whether some process is taking too much shared memory, or whether there are too many zombie processes, and kill them with a command like ps -aef | grep ffmpeg | awk '{print $2}' | xargs kill -9
Check it with free -h, there is about 112M shared memory now.
total used free shared buffers cached
Mem: 15G 4.4G 11G 112M 46M 416M
-/+ buffers/cache: 3.9G 11G
Swap: 0B 0B 0B
Finally, restart your GitLab with gitlab-ctl restart; after some time GitLab boots up and the 502 is gone.
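The HINT in the PostgreSQL log above also suggests reducing shared_buffers or max_connections. If the backup keeps pushing the bundled PostgreSQL over the limit, a minimal sketch for /etc/gitlab/gitlab.rb, assuming the Omnibus package and example values sized for a 4 GB machine:
# /etc/gitlab/gitlab.rb -- example values, tune for your workload
postgresql['shared_buffers'] = "256MB"
postgresql['max_connections'] = 100
Apply the change with sudo gitlab-ctl reconfigure.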

After a long search I found something about it. After taking the backup, my gitlab-workhorse becomes idle and gitlab.socket refuses the connection. As a temporary solution I have installed a new cron job that restarts the GitLab service after the GitLab backup cron job completes (see the crontab sketch below).
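A minimal crontab sketch of that workaround, assuming the Omnibus paths and that the backup starts at 02:00 and finishes well within an hour (the times are hypothetical; adjust them to your own schedule):
# root's crontab: back up at 02:00, restart GitLab at 03:00
0 2 * * * /opt/gitlab/bin/gitlab-rake gitlab:backup:create CRON=1
0 3 * * * /opt/gitlab/bin/gitlab-ctl restart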

If GitLab is installed in VirtualBox on Ubuntu Server 18.04 or 20.04, increase the RAM to 4 GB and provide at least 3 processors (for example with VBoxManage, as sketched below).
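A sketch of doing that from the VirtualBox host with VBoxManage, assuming the VM is called "gitlab-vm" (a hypothetical name) and is powered off:
VBoxManage modifyvm "gitlab-vm" --memory 4096 --cpus 3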

Related

Mysterious MariaDB 10.4.1 RAM usage

I have upgraded from MariaDB 10.1.36 to 10.4.8 and I can see mysteriously increasing RAM usage on the new version. I also edited innodb_buffer_pool_size and it seems to make no difference whether it is set to 15M or 4G; RAM just keeps slowly increasing. After a while it eats all the RAM, the OOM killer kills MariaDB, and this repeats.
My server has 8 GB RAM and usage increases by roughly 60-150 MB per day. That is not terrible on its own, but I have around 150 database servers, so it is a huge problem.
I can temporarily fix the problem by restarting MariaDB, but then it starts growing again.
Info about database server:
databases: 200+
tables: 28200(141 per database)
average active connections: 100-200
size of stored data: 100-350GB
cpu: 4
ram: 8GB
Here is my config:
server-id=101
datadir=/opt/mysql/
socket=/var/lib/mysql/mysql.sock
tmpdir=/tmp/
gtid-ignore-duplicates=True
log_bin=mysql-bin
expire_logs_days=4
wait_timeout=360
thread_cache_size=16
sql_mode="ALLOW_INVALID_DATES"
long_query_time=0.8
slow_query_log=1
slow_query_log_file=/opt/log/slow.log
log_output=TABLE
userstat = 1
user=mysql
symbolic-links=0
binlog_format=STATEMENT
default_storage_engine=InnoDB
slave_skip_errors=1062,1396,1690
innodb_autoinc_lock_mode=2
innodb_buffer_pool_size=4G
innodb_buffer_pool_instances=5
innodb_log_file_size=1G
innodb_log_buffer_size=196M
innodb_flush_log_at_trx_commit=1
innodb_thread_concurrency=24
innodb_file_per_table
innodb_write_io_threads=24
innodb_read_io_threads=24
innodb_adaptive_flushing=1
innodb_purge_threads=5
innodb_adaptive_hash_index=64
innodb_flush_neighbors=0
innodb_flush_method=O_DIRECT
innodb_io_capacity=10000
innodb_io_capacity_max=16000
innodb_lru_scan_depth=1024
innodb_sort_buffer_size=32M
innodb_ft_cache_size=70M
innodb_ft_total_cache_size=1G
innodb_lock_wait_timeout=300
slave_parallel_threads=5
slave_parallel_mode=optimistic
slave_parallel_max_queued=10000000
log_slave_updates=on
performance_schema=on
skip-name-resolve
max_allowed_packet = 512M
query_cache_type=0
query_cache_size = 0
query_cache_limit = 1M
query_cache_min_res_unit=1K
max_connections = 1500
table_open_cache=64K
innodb_open_files=64K
table_definition_cache=64K
open_files_limit=1020000
collation-server = utf8_general_ci
character-set-server = utf8
log-error=/opt/log/error.log
pid-file=/var/run/mysqld/mysqld.pid
malloc-lib=/usr/lib64/libjemalloc.so.1
I solved it! The problem is the memory allocation library.
Run this SQL query:
SHOW VARIABLES LIKE 'version_malloc_library';
You should get a value containing "jemalloc". If you get only "system", you may have problems.
To change that, edit any .conf file in this directory:
/etc/systemd/system/mariadb.service.d/
There, add this line:
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"
(this library file may be in another folder)
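A minimal sketch of a complete drop-in file, assuming a file name such as /etc/systemd/system/mariadb.service.d/jemalloc.conf (the name is arbitrary); the Environment= line must sit under a [Service] header:
[Service]
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1"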
Then restart mysqld:
service mysqld stop && systemctl daemon-reload && service mysqld start
You got carried away in increasing values in my.cnf.
Many of the caches grow until hitting their limit, hence the memory growth you experienced.
What is the value from SHOW GLOBAL STATUS LIKE 'Max_used_connections';? Having a large max_connections accentuates several of the other values; lower it.
But perhaps the really bad one(s) involve table caches -- which have units of tables, not bytes. Crank these down a lot:
table_open_cache=64K
innodb_open_files=64K
table_definition_cache=64K
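Before lowering them, a hedged sketch of checking how much of those caches and connections you actually use (these are standard status variables, run from any MySQL client):
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL STATUS LIKE 'Open_tables';
SHOW GLOBAL STATUS LIKE 'Open_table_definitions';
Compare the results against max_connections, table_open_cache and table_definition_cache to see how far you can reduce them.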
I have exactly the same problem. Is it due to a bad configuration, or is it a bug in the new version?
MariaDB 10.1 was updated to 10.3 when I upgraded Debian 9 to Debian 10. I tried to solve the problem with MariaDB 10.4 but nothing changed.
I want to downgrade, but I think it would be necessary to dump all databases and restore them, and that means hours without service.
I don't think Debian 10 has anything to do with the issue.
Please read my previous comments about alternative memory allocators...
(Memory usage graphs are omitted here: one showing usage when jemalloc is used, one when the default memory allocator is used.)
Try with tcmalloc
Environment="LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4.5.3.1"

EC2 with WordPress - running out of space every day (no space left on device) - can't start Apache

I'm having the strangest problem for days now. I took over a WordPress website for a company that was originally developed by someone else; the codebase is a mess, but I was able to go over it and make sure it at least works.
The database is huge (70 MB) and the site has a lot of plugin dependencies.
However, the site now generally works without issues and I'm hosting it on an EC2 instance with a Bitnami stack for WordPress.
The weird thing, though, is that every day (for instance this morning) I check the site and it's down …
Service Unavailable
The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.
Additionally, a 503 Service Unavailable error was encountered while trying to use an ErrorDocument to handle the request.
When logging into the server with SSH and trying to restart Apache I get this:
Failed to unmonitor apache: write /var/lib/gonit/state: no space left on device
Syntax OK
/opt/bitnami/apache2/scripts/ctl.sh : apache not running
Syntax OK
(98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
(98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
AH00015: Unable to open logs
/opt/bitnami/apache2/scripts/ctl.sh : httpd could not be started
Failed to monitor apache: write /var/lib/gonit/state: no space left on device
This has now happened for the 3rd time in 3 days, even though I restored the server from a snapshot onto a 200 GB volume (for testing purposes), and all site files including uploads only take up 5 GB.
The site is running on an EC2 instance (t2.medium) with a 200 GB volume now, and this morning I can't restart Apache. Yesterday evening, after restoring from the snapshot, the site worked well and normally - it was actually even fast.
I don't know where to start investigating here. What could cause the server to run out of disk space in one night?
Thanks,
Matt
Also, one of the weirdest things: I reset everything yesterday evening from an EC2 snapshot to a 200 GB volume and attached it to the instance. Everything was working fine. I made some changes to the files, deleted some plugins, updated some settings.
And it seems this is all gone now. And I'm using an Elastic IP, so I couldn't have connected to the wrong instance or something.
Bitnami Engineer here. You will probably need to resize the disk of your instance, but you can investigate those issues later; these commands will show the directories with a large number of files:
cd /opt/bitnami
sudo find . -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
du -h -d 1
If MySQL is the service that is taking up the most space, you can try adding this line under the [mysqld] block of the /opt/bitnami/mysql/my.cnf configuration file:
expire_logs_days = 7
That will force MySQL to purge binary logs older than 7 days. You will need to restart MySQL after that:
sudo /opt/bitnami/ctlscript.sh restart mysql
More information here:
https://community.bitnami.com/t/something-taking-up-space-and-growing/64532/7
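If the binary logs turn out to be what is filling the disk, a hedged sketch of reclaiming the space immediately, before expire_logs_days takes effect (run from the MySQL client; the 7-day cutoff is just an example):
SHOW BINARY LOGS;
PURGE BINARY LOGS BEFORE NOW() - INTERVAL 7 DAY;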
What you need to do is increase the size of the partition on the disk and the size of the file system on that partition. Even though you increased the volume size, these figures stay unchanged, and creating another instance from a snapshot does not help either.
Check how to do it here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html
Your df result shows:
Filesystem 1K-blocks Used Available Use% Mounted on
udev 2014560 0 2014560 0% /dev
tmpfs 404496 5872 398624 2% /run
/dev/xvda1 20263528 20119048 128096 100% /
tmpfs 2022464 0 2022464 0% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 2022464 0 2022464 0% /sys/fs/cgroup
/dev/loop0 18432 18432 0 100% /snap/amazon-ssm-agent/1480
/dev/loop1 91264 91264 0 100% /snap/core/7713
/dev/loop2 12928 12928 0 100% /snap/amazon-ssm-agent/295
/dev/loop3 91264 91264 0 100% /snap/core/7917
tmpfs 404496 0 404496 0% /run/user/1000
where the root volume /dev/xvda1 has only 20 GB and is at 100% usage, not the 200 GB you mentioned.
When you increase the volume size while the instance is running, the change is not applied automatically. On your EC2 instance, you have to apply the volume change, for example as follows:
sudo resize2fs /dev/xvda1
Then check the size of the volume with df -h and you should see that the size is now 200 GB.
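Depending on the image, the partition itself may also still be 20 GB, in which case resize2fs alone will not grow anything; a hedged sketch of the full sequence from the linked AWS guide, assuming the root device really is /dev/xvda with partition 1 and an ext4 filesystem:
sudo growpart /dev/xvda 1     # grow partition 1 to fill the enlarged volume
sudo resize2fs /dev/xvda1     # grow the ext4 filesystem to fill the partition
df -h /                       # confirm the root filesystem now shows ~200 GB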

Openstack Instance does not use the entire hard disk

I created a new VM instance using the "Ubuntu Server 10.04 LTS (Lucid Lynx) - 32 bits" image and the m1.small flavour, which has a 20 GB disk (OpenStack Icehouse). When I log in to the VM and run df -h, I find that the VM does not use the entire assigned disk. The command results are shown below:
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 1.4G 595M 721M 46% /
none 1005M 144K 1005M 1% /dev
none 1007M 0 1007M 0% /dev/shm
none 1007M 36K 1007M 1% /var/run
none 1007M 0 1007M 0% /var/lock
none 1007M 0 1007M 0% /lib/init/rw
The "fdisk -l" shows the DH size is 20 GB:
Disk /dev/vda: 21.5 GB, 21474836480 bytes
4 heads, 32 sectors/track, 327680 cylinders
Units = cylinders of 128 * 512 = 65536 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000cb9da
Device Boot Start End Blocks Id System
/dev/vda1 * 17 32768 2096128 83 Linux
I need the VM to take the full space assigned to it. Any idea how I could fix it? I want the solution to apply to each VM I create, so I do not want to manually update every VM after instantiation. I also must use the 10.04 image (I cannot upgrade to 14.04).
The problem here is the image. I grabbed that one and ran it up; it's pretty simple to run
sudo resize2fs /dev/vda1
which will resize the filesystem to the size of the partition, which seems to be about 2 GB. Beyond that, you have to increase the partition size. For that I think you're probably best off using virt-resize; there are some good howtos out there, e.g. on askubuntu. In essence:
SSH into your openstack controller node
source keystonerc_admin (or whatever yours may be called)
nova list --all-tenants | grep <instance_name> or just grab the server guid from horizon
nova show <server_guid> and note which nova host your machine is running on. Also note the instance name (e.g. instance-00000adb)
SSH into that nova node
virsh dumpxml instance-00000adb and look for the image file. On mine, this is /var/lib/nova/instances/<server_guid>/disk but that may not always be the case?
yum install libguestfs-tools
truncate -r /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
truncate -s +2G /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
virt-resize --expand /dev/sda1 /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
mv disk disk.old ; mv disk.new disk
NB - mine didn't quite work when I booted it up again; I haven't had time to investigate yet, but it can't be far off that, and hopefully this helps.
Once you've managed to boot that up again, then you can shut it down and create a snapshot from horizon. You can then use that snapshot just like any other image, and launch all subsequent VMs directly from there.
HTH.
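Once the resized instance boots, a quick hedged check from inside the guest to confirm the extra space is visible (resize2fs is only needed if the filesystem still lags behind the enlarged partition):
df -h /                      # root filesystem should now be close to 20 GB
sudo resize2fs /dev/vda1     # grow the filesystem if it still shows the old size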

GridGain Out of Memory Exception: Unable to create new native thread

I'm trying to create more than 2 instances of GridGain (just by running the shell script) on Red Hat Release 6.5 (Santiago), but I get the following error when I try to run the shell script a 3rd time:
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:714)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
at java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1604)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start0(GridGainEx.java:1507)
at org.gridgain.grid.kernal.GridGainEx$GridNamedInstance.start(GridGainEx.java:1289)
at org.gridgain.grid.kernal.GridGainEx.start0(GridGainEx.java:832)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:759)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:677)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:524)
at org.gridgain.grid.kernal.GridGainEx.start(GridGainEx.java:494)
at org.gridgain.grid.GridGain.start(GridGain.java:314)
at org.gridgain.grid.startup.cmdline.GridCommandLineStartup.main(GridCommandLineStartup.java:293)
I have set ulimit -n 4096 but still no joy.
The box has 64 GB of memory - an ample amount to run more than 2 instances of GridGain.
Can anyone help with this error? Are there any configuration changes I can make in Red Hat?
Thanks
Most likely you are running out of the allowed number of user processes. We encountered the same issue on our CentOS servers, and setting ulimit -u 10240 helped.
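A hedged sketch of checking and persisting that limit, assuming the GridGain nodes run under a user called gridgain (a hypothetical name; use your actual service account):
ulimit -u          # show the current max user processes for this shell
ulimit -u 10240    # raise it for the current shell only
To make it survive new logins, add these lines to /etc/security/limits.conf (or a file under /etc/security/limits.d/):
gridgain soft nproc 10240
gridgain hard nproc 10240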

Error establishing a database connection EC2 Amazon

I hope you can help me. I can't stand having to keep restarting my EC2 instance on Amazon.
I have two WordPress sites hosted there. My sites had always worked well until two months ago, when one of them started having this problem. I tried everything to patch it up, and the only solution was to reconfigure it.
Just when everything was right with both of them, the second site started having the same problem. I think Amazon is messing with me.
I am using a free micro instance. If anyone knows what the problem is, please help me!
Your issue will be the limited memory allocated to the t1.micro instances in EC2. I'm assuming you are using the Amazon Linux AMI in this case; if an alternate version of Linux is used, you may have different locations for your log and config files.
Make sure you are the root user.
Have a look at your MySQL logs in the following location:
/var/log/mysqld.log
If you see repeated instances of the following, it's pretty certain that the 0.6 GB of memory allocated to the micro instance is not cutting it:
150714 22:13:33 InnoDB: Initializing buffer pool, size = 12.0M
InnoDB: mmap(12877824 bytes) failed; errno 12
150714 22:13:33 InnoDB: Completed initialization of buffer pool
150714 22:13:33 InnoDB: Fatal error: cannot allocate memory for the buffer pool
150714 22:13:33 [ERROR] Plugin 'InnoDB' init function returned error.
150714 22:13:33 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
150714 22:13:33 [ERROR] Unknown/unsupported storage engine: InnoDB
150714 22:13:33 [ERROR] Aborting
You will notice in the log excerpt above that my buffer pool size is set to 12MB. This can be configured by adding the line innodb_buffer_pool_size = 12M to your MySQL config file /etc/my.cnf.
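For reference, a minimal sketch of that setting in /etc/my.cnf (12M is just the small value from the log above; pick whatever fits your instance):
[mysqld]
innodb_buffer_pool_size = 12M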
A pretty good way to deal with InnoDB chewing up your memory is to create a swap file.
Start by checking the status of your memory:
free -m
You will most probably see that your swap is not doing much:
total used free shared buffers cached
Mem: 592 574 17 0 15 235
-/+ buffers/cache: 323 268
Swap: 0 0 0
To start, ensure you are logged in as the root user and run the following command:
dd if=/dev/zero of=/swapfile bs=1M count=1024
Wait for a bit as the command is not verbose but you should see the following response after about 15 seconds when the process is complete:
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 31.505 s, 34.1 MB/s
Next set up the swapspace with:
mkswap /swapfile
Now turn the swap on:
swapon /swapfile
If you get a permissions response you can ignore it or address the swap file by changing the permissions to 600 with the chmod command.
chmod 600 /swapfile
Now add the following line to /etc/fstab to create the swap spaces on server start:
/swapfile swap swap defaults 0 0
Restart your MySQL instance:
service mysqld restart
Finally check to see if your swap file is working correctly with the free -m command.
You should see something like:
total used free shared buffers cached
Mem: 592 575 16 0 16 235
-/+ buffers/cache: 323 269
Swap: 1023 0 1023
Hope this helps.
