OpenStack instance does not use the entire hard disk

I created a new VM instance using the "Ubuntu Server 10.04 LTS (Lucid Lynx) - 32 bits" image and the m1.small flavor, which has a 20 GB disk (OpenStack Icehouse). When I log in to the VM and run df -h, I find that the VM does not use the entire assigned disk. The output is as follows:
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 1.4G 595M 721M 46% /
none 1005M 144K 1005M 1% /dev
none 1007M 0 1007M 0% /dev/shm
none 1007M 36K 1007M 1% /var/run
none 1007M 0 1007M 0% /var/lock
none 1007M 0 1007M 0% /lib/init/rw
The "fdisk -l" output shows that the disk size is 20 GB:
Disk /dev/vda: 21.5 GB, 21474836480 bytes
4 heads, 32 sectors/track, 327680 cylinders
Units = cylinders of 128 * 512 = 65536 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000cb9da
Device Boot Start End Blocks Id System
/dev/vda1 * 17 32768 2096128 83 Linux
I need the VM to use the full space assigned to it. Any idea how I could fix this? I want the solution to apply to each VM I create, so I do not want to manually update the VM after instantiation. I must also use the 10.04 image (I cannot upgrade to 14.04).

The problem here is the image. I grabbed that image and booted it up; it's pretty simple to run
sudo resize2fs /dev/vda1
which will resize the filesystem to the size of the partition, which seems to be 2 GB. Beyond that, you have to increase the partition size. For that I think you're probably best off using virt-resize. There are some good how-tos out there (e.g. on Ask Ubuntu). In essence:
SSH into your openstack controller node
source keystonerc_admin (or whatever yours may be called)
nova list --all-tenants | grep <instance_name> or just grab the server guid from horizon
nova show <server_guid> and note which nova host your machine is running on. Also note the instance name (e.g. instance-00000adb)
SSH into that nova node
virsh dumpxml instance-00000adb and look for the image file. On mine, this is /var/lib/nova/instances/<server_guid>/disk but that may not always be the case?
yum install libguestfs-tools
truncate -r /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
truncate -s +2G /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
virt-resize --expand /dev/sda1 /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk /var/lib/nova/instances/d887249a-0d95-473e-b4f2-41f71df4dbb5/disk.new
mv disk disk.old ; mv disk.new disk
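The truncate / virt-resize / mv steps above can be collected into a small helper so the same sequence is applied consistently to any instance disk. This is only a sketch: the function name build_resize_commands is made up for illustration, the +2G growth and /dev/sda1 partition are simply the values used in the steps above, and the helper only builds the command lines rather than running them.

```python
from pathlib import PurePosixPath
import shlex


def build_resize_commands(disk, grow_by="+2G", partition="/dev/sda1"):
    """Build (but do not run) the command lines from the steps above.

    disk      -- path to the instance disk file on the nova node
    grow_by   -- size passed to `truncate -s` (value taken from the steps above)
    partition -- partition that virt-resize should expand
    """
    disk = PurePosixPath(disk)
    new = disk.with_name(disk.name + ".new")
    old = disk.with_name(disk.name + ".old")
    return [
        # Create disk.new with the same size as the original disk file.
        ["truncate", "-r", str(disk), str(new)],
        # Grow disk.new by the requested amount.
        ["truncate", "-s", grow_by, str(new)],
        # Copy the original into disk.new, expanding the chosen partition.
        ["virt-resize", "--expand", partition, str(disk), str(new)],
        # Keep the original as disk.old and swap the resized file in.
        ["mv", str(disk), str(old)],
        ["mv", str(new), str(disk)],
    ]


if __name__ == "__main__":
    # Placeholder path; substitute the path reported by `virsh dumpxml`.
    for argv in build_resize_commands("/var/lib/nova/instances/SERVER_GUID/disk"):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before running anything against a real instance disk; the instance should be shut down before the disk file is touched.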
NB: mine didn't quite work when I booted it up again. I haven't had time to investigate yet, but it can't be far off, and hopefully this helps.
Once you've managed to boot that up again, then you can shut it down and create a snapshot from horizon. You can then use that snapshot just like any other image, and launch all subsequent VMs directly from there.
HTH.
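The mismatch this answer describes can be checked directly from the fdisk figures in the question. The sketch below assumes the fdisk "Blocks" column is in 1 KiB units (the usual convention for this fdisk output format); the constant names are made up for illustration.

```python
# Figures copied from the fdisk -l output in the question.
DISK_BYTES = 21474836480   # "Disk /dev/vda: 21.5 GB, 21474836480 bytes"
PARTITION_KIB = 2096128    # "Blocks" column for /dev/vda1 (assumed 1 KiB units)

GIB = 1024 ** 3


def to_gib(nbytes):
    """Convert a byte count to GiB."""
    return nbytes / GIB


disk_gib = to_gib(DISK_BYTES)
partition_gib = to_gib(PARTITION_KIB * 1024)
unallocated_gib = disk_gib - partition_gib

print(f"disk:        {disk_gib:.2f} GiB")
print(f"partition:   {partition_gib:.2f} GiB")
print(f"unallocated: {unallocated_gib:.2f} GiB")
```

So the partition baked into the image is about 2 GiB on a 20 GiB virtual disk: resize2fs alone can only grow the filesystem up to that 2 GiB boundary, and the remaining space is reachable only after the partition itself is enlarged.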

Related

EC2 with wordpress - everyday running out of space (no space left on device) - can't start apache

I'm having the strangest problem for days now. I took over a WordPress website of a company that was originally developed by another person – the codebase is a mess but I was able to go over it and make sure it at least is working.
The database is huge (70 MB) and there are a lot of plugin dependencies on the site.
However, the site generally works without issues now, and I'm hosting it on an EC2 instance with a Bitnami stack for WordPress.
The weird thing, though, is that every day (for instance this morning) I check the site and it's down:
Service Unavailable The server is temporarily unable to service your
request due to maintenance downtime or capacity problems. Please try
again later.
Additionally, a 503 Service Unavailable error was encountered while
trying to use an ErrorDocument to handle the request.
When logging into the server with ssh and trying to restart apache I get this:
Failed to unmonitor apache: write /var/lib/gonit/state: no space left
on device Syntax OK /opt/bitnami/apache2/scripts/ctl.sh : apache not
running Syntax OK (98)Address already in use: AH00072: make_sock:
could not bind to address [::]:80 (98)Address already in use: AH00072:
make_sock: could not bind to address 0.0.0.0:80 no listening sockets
available, shutting down AH00015: Unable to open logs
/opt/bitnami/apache2/scripts/ctl.sh : httpd could not be started
Failed to monitor apache: write /var/lib/gonit/state: no space left on
device
This has now happened for the third time in three days, even though I restored the server from a snapshot onto a 200 GB volume (for testing purposes) and all site files, including uploads, take up only 5 GB.
The site is running on an EC2 instance (t2.medium) with a 200 GB volume now, and this morning I can't restart Apache. Yesterday evening, after restoring from a snapshot, the site worked normally; it was actually even fast.
I don't know where to start investigating here. What could cause the server to run out of disk space in one night?
Thanks,
Matt
Also, one of the weirdest things: I reset everything yesterday evening from an EC2 snapshot to a 200 GB volume and attached it to the instance. Everything was working fine. I made some changes to the files, deleted some plugins, and updated some settings.
And it seems this is all gone now. I'm using an Elastic IP, so I couldn't have connected to the wrong machine or something.
Bitnami engineer here. You will probably need to resize the disk of your instance. You can investigate the underlying cause later; these commands will show which directories hold the most files and the most data:
cd /opt/bitnami
sudo find . -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n
du -h -d 1
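The same per-directory survey can be sketched in Python, which is handy when the output needs to be sorted or logged over several days to see what grows overnight. This is a rough equivalent, not a replacement for the commands above: the function name usage_by_subdir is made up for illustration, and it sums apparent file sizes (st_size), so the totals can differ somewhat from what du reports.

```python
import os


def usage_by_subdir(root):
    """Return {subdir_name: (file_count, total_bytes)} for each direct
    child directory of root, walking each one recursively."""
    result = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        count = 0
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    # lstat: do not follow symlinks out of the tree.
                    total += os.lstat(os.path.join(dirpath, name)).st_size
                    count += 1
                except OSError:
                    # File vanished or is unreadable; skip it.
                    pass
        result[entry.name] = (count, total)
    return result


def report(root):
    """Print subdirectories of root, largest first."""
    usage = usage_by_subdir(root)
    for name, (count, total) in sorted(usage.items(), key=lambda kv: kv[1][1], reverse=True):
        print(f"{total / 1024 ** 2:10.1f} MiB  {count:8d} files  {name}")
```

Calling report("/opt/bitnami") (run with sufficient privileges) would list the largest subdirectories first; comparing two runs a few hours apart shows which directory is filling the disk.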
If MySQL is the service taking up the most space, you can try adding this line under the [mysqld] block of the /opt/bitnami/mysql/my.cnf configuration file:
expire_logs_days = 7
That will force MySQL to purge binary logs older than 7 days. You will need to restart MySQL after that:
sudo /opt/bitnami/ctlscript.sh restart mysql
More information here:
https://community.bitnami.com/t/something-taking-up-space-and-growing/64532/7
What you need to do is increase the size of the partition on the disk and the size of the file system on that partition. Even though you increased the volume size, these figures remained unchanged. Creating another volume from the snapshot would not help either.
See how to do it here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/recognize-expanded-volume-linux.html
Your df result shows
Filesystem 1K-blocks Used Available Use% Mounted on
udev 2014560 0 2014560 0% /dev
tmpfs 404496 5872 398624 2% /run
/dev/xvda1 20263528 20119048 128096 100% /
tmpfs 2022464 0 2022464 0% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 2022464 0 2022464 0% /sys/fs/cgroup
/dev/loop0 18432 18432 0 100% /snap/amazon-ssm-agent/1480
/dev/loop1 91264 91264 0 100% /snap/core/7713
/dev/loop2 12928 12928 0 100% /snap/amazon-ssm-agent/295
/dev/loop3 91264 91264 0 100% /snap/core/7917
tmpfs 404496 0 404496 0% /run/user/1000
where the root volume /dev/xvda1 has only 20 GB and is 100% full, not 200 GB as you mentioned.
When you increase the volume size while the instance is running, the change is not applied automatically. On your EC2 instance, you have to extend the file system to use the new space as follows (if the partition itself still shows the old size, it has to be grown first, for example with growpart):
sudo resize2fs /dev/xvda1
Then check the size of the volume with df -h; it should now show 200 GB.
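To catch this situation before Apache fails, the df output can be checked automatically. The sketch below parses df text in the six-column layout shown above and reports file systems at or above a threshold; the function name full_filesystems, the 90% default, and the choice to skip /dev/loop devices (snap images, which always read 100%) are all assumptions made for illustration.

```python
def full_filesystems(df_output, threshold=90):
    """Return [(filesystem, use_percent, mount_point)] for every row of
    `df` output whose Use% is at or above threshold.

    Read-only loop devices (e.g. snap images) are skipped because they
    are always reported as 100% full.
    """
    rows = []
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue  # not a regular six-column df row
        filesystem, percent, mount = fields[0], int(fields[4].rstrip("%")), fields[5]
        if filesystem.startswith("/dev/loop"):
            continue
        if percent >= threshold:
            rows.append((filesystem, percent, mount))
    return rows


# Sample rows in the same layout as the df output above.
SAMPLE = """\
Filesystem 1K-blocks Used Available Use% Mounted on
udev 2014560 0 2014560 0% /dev
tmpfs 404496 5872 398624 2% /run
/dev/xvda1 20263528 20119048 128096 100% /
/dev/loop0 18432 18432 0 100% /snap/amazon-ssm-agent/1480
"""

if __name__ == "__main__":
    for filesystem, percent, mount in full_filesystems(SAMPLE):
        print(f"{filesystem} mounted on {mount} is {percent}% full")
```

In practice the text would come from running df (for example through subprocess.run with capture_output=True) on a schedule, so a nearly full root volume is noticed before services start failing.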

Why do OpenStack Swift services put all their data/files in root and not in my specified partition?

I deployed using kolla-ansible 5.0.0.
I used fdisk to create a new xfs sda4 primary partition with a size of 1.7 TB and then created the rings following this documentation for kolla-ansible:
https://github.com/openstack/kolla-ansible/blob/master/doc/source/reference/swift-guide.rst
After I deployed, Swift seems to work fine. However, /dev/sda4 is not mounted on /srv/node/sda4, and all of Swift's files and data get put in root.
Output of fdisk -l showing the sda4 partition I want Swift to use:
[root@openstackstorage1 swift]# fdisk -l
Disk /dev/sda: 1999.8 GB, 1999844147200 bytes, 3905945600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000c22f6
Device Boot Start End Blocks Id System
/dev/sda1 * 2048 718847 358400 83 Linux
/dev/sda2 718848 2815999 1048576 82 Linux swap / Solaris
/dev/sda3 2816000 209663999 103424000 8e Linux LVM
/dev/sda4 209664000 3905945599 1848140800 83 Linux
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.
output of df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rootvg01-lv_root 98G 3.4G 95G 4% /
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 9.0M 3.9G 1% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/mapper/openstackvg01-lv_openstackstorage 2.8T 75G 2.7T 3% /var/lib/docker
/dev/sda1 347M 183M 165M 53% /boot
tmpfs 782M 0 782M 0% /run/user/0
This output of df -h /srv/node/sda4 shows that /srv/node/sda4 sits on the root logical volume rather than on its own mount:
[root@openstackstorage1 swift]# df -h /srv/node/sda4/
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rootvg01-lv_root 98G 3.4G 95G 4% /
But shouldn't the /dev/sda4 partition I made be mounted on /srv/node/sda4?
I'm not sure what I did wrong and would appreciate some guidance.
The reason this was not working is that my /dev/sda4 did not contain an XFS filesystem. I just had to run mkfs.xfs -f -i size=1024 -L sda4 /dev/sda4 on the partition I created and then mount it myself with mount -t xfs -L sda4 /srv/node/sda4
I then had to restart all Swift services, and now all Swift files and data are being stored in /srv/node/sda4, where /dev/sda4 is mounted.
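A quick way to confirm which device actually backs /srv/node/sda4 is to look it up in the mount table, which is what df does in the question. The sketch below is a minimal illustration: the function name backing_mount is made up, and the sample tables are hypothetical text in the /proc/mounts layout (device, mount point, type, options), not output from the poster's machine.

```python
import posixpath


def backing_mount(path, mounts_text):
    """Return (device, mount_point) for the longest mount point that
    contains path, given text in /proc/mounts format, or None."""
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the most specific (longest) matching mount point.
            if best is None or len(mount_point) > len(best[1]):
                best = (device, mount_point)
    return best


# Hypothetical mount tables before and after mounting the partition.
BEFORE = (
    "/dev/mapper/rootvg01-lv_root / xfs rw 0 0\n"
    "/dev/sda1 /boot xfs rw 0 0\n"
)
AFTER = BEFORE + "/dev/sda4 /srv/node/sda4 xfs rw 0 0\n"

if __name__ == "__main__":
    print(backing_mount("/srv/node/sda4", BEFORE))
    print(backing_mount("/srv/node/sda4", AFTER))
```

On a live host the text can be read from /proc/mounts, and os.path.ismount("/srv/node/sda4") gives a simpler yes/no answer. Checking this before starting the Swift services avoids silently filling the root volume.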

There is not enough space available in tmpfs docker container

I'm running a docker container that seems to have insufficient memory and I'm not sure how to solve this problem.
I'm essentially running a program in this Docker container that downloads an image into tmpfs, performs some operations, deletes the image and returns a result. However, it seems like I'm running into images that are too large to store in my current Docker tmpfs. Below is the output of the Linux df command while inside the container:
Filesystem Size Used Avail Use% Mounted on
overlay 63G 11G 50G 18% /
tmpfs 64M 0 64M 0% /dev
tmpfs 6.9G 0 6.9G 0% /sys/fs/cgroup
/dev/sda1 63G 11G 50G 18% /etc/hosts
shm 64M 4.0K 64M 1% /dev/shm
tmpfs 6.9G 0 6.9G 0% /sys/firmware
I've tried expanding docker's memory (hence the huge values on two of the tmpfs's) but I'm still running into this problem.
I guess I have a couple of questions:
1) what is the difference between the 3 separate tmpfs filesystems? Why do they exist?
2) Presumably I need to expand the first tmpfs size (the small one) -- how would I go about doing that?
Finally, some relevant system information:
OS -- OSX
Docker version -- Docker version 17.09.0-ce, build afdb6d4
Let me know if there's other stuff you need to know!
Thanks everyone.
Okay, ultimately figured out the answer. My original two questions were kind of off base.
Essentially, my Docker container didn't have enough shared memory; the tmpfs filesystems were red herrings. I ended up needing to pass a --shm-size="4096m" argument to my docker run command (increasing /dev/shm to 4096 megabytes) in order to allow my function to execute properly. Hope this helps someone down the road!
Also, for search purposes, the exact error I was getting was "There is not enough space available on the shmfs/tmpfs file system.", relating to Abbyy FineReader.
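If the program inside the container knows roughly how large its working data will be, it can check the free space in /dev/shm up front and fail with a clearer message than the one above. This is a minimal sketch, assuming the data lands in /dev/shm; the function name has_room and the 10% headroom are made up for illustration.

```python
import shutil


def has_room(path, needed_bytes, headroom=0.10):
    """Return True if the file system containing path has at least
    needed_bytes free, plus a safety margin (10% by default)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free >= needed_bytes * (1 + headroom)


def describe(path):
    """Return a one-line summary of total and free space at path."""
    usage = shutil.disk_usage(path)
    mib = 1024 ** 2
    return f"{path}: {usage.free / mib:.0f} MiB free of {usage.total / mib:.0f} MiB"
```

Inside the container shown in the question, describe("/dev/shm") would report the 64M default, and has_room("/dev/shm", size_of_download) would return False for anything larger, which points straight at the --shm-size setting.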
If you are using Kubernetes, you need sufficient space in /dev/shm.
In my case, there wasn't enough space in /dev/shm, so Abbyy would bail out before extracting meta-images. After giving /dev/shm a volume mount, it worked fine. Hope this helps!

502: GitLab is taking too much time to respond

Every day, after the GitLab backup runs, GitLab throws a 502 error.
I looked at the nginx logs but did not find much information.
After gitlab-ctl restart it starts working again.
System configuration:
OS: Ubuntu 16.04 LTS
4 GB RAM
200 GB disk space
Can anyone suggest a permanent solution for this?
There is a high probability that it is running out of shared memory, since you get the 502 error each time after the backup.
Check the logs with gitlab-ctl tail.
It will show something like:
2019-04-12_12:37:17.27154 FATAL: could not map anonymous shared memory: Cannot allocate memory
2019-04-12_12:37:17.27157 HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 4345470976 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
2019-04-12_12:37:17.27171 LOG: database system is shut down
Then check with free -m, which shows that there is no shared memory available:
total used free shared buffers cached
Mem: 16081 13715 2365 0 104 753
-/+ buffers/cache: 12857 3223
Then check whether some process is taking too much shared memory, or whether there are too many zombie processes, and kill them with a command like ps -aef | grep ffmpeg | awk '{print $2}' | xargs kill -9
Check again with free -h; there is about 112M of shared memory now:
total used free shared buffers cached
Mem: 15G 4.4G 11G 112M 46M 416M
-/+ buffers/cache: 3.9G 11G
Swap: 0B 0B 0B
Finally, restart GitLab with gitlab-ctl restart. After some time GitLab boots and the 502 is gone.
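One pitfall with the ps | grep | awk pipeline above is that grep can match its own process line. A slightly more careful selection can be sketched in Python by comparing the executable name exactly. The function name pids_for_command and the sample listing are hypothetical; the column layout assumed is the usual ps -aef one (UID PID PPID C STIME TTY TIME CMD).

```python
def pids_for_command(ps_output, name):
    """Return the PIDs of processes whose executable is exactly `name`,
    given text in `ps -aef` layout (UID PID PPID C STIME TTY TIME CMD)."""
    pids = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)  # keep the whole command as one field
        if len(fields) < 8:
            continue
        pid, command = int(fields[1]), fields[7]
        # Compare the basename of the first word, so "/usr/bin/ffmpeg"
        # matches "ffmpeg" but "grep ffmpeg" does not.
        executable = command.split()[0].rsplit("/", 1)[-1]
        if executable == name:
            pids.append(pid)
    return pids


# Hypothetical listing, for illustration only.
SAMPLE_PS = """\
UID        PID  PPID  C STIME TTY          TIME CMD
git       1201     1  0 10:00 ?        00:00:05 /usr/bin/ffmpeg -i in.mp4 out.mp4
root      1302  1100  0 10:01 pts/0    00:00:00 grep ffmpeg
git       1403     1  0 10:02 ?        00:00:01 ffmpeg -version
"""

if __name__ == "__main__":
    print(pids_for_command(SAMPLE_PS, "ffmpeg"))
```

Each returned PID could then be passed to os.kill(pid, signal.SIGKILL), which corresponds to kill -9; it is worth reviewing the list before killing anything on a production host.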
After a long search I found something about this. After the backup, my gitlab-workhorse goes idle and gitlab.socket refuses connections. As a temporary solution, I have added a new cron job that restarts the GitLab service after the GitLab backup cron job completes.
If GitLab is installed in VirtualBox on Ubuntu Server 18.04 or 20.04,
please increase the RAM to 4 GB and provide at least 3 processors.

Not enough space when running Hudson jobs

My Hudson jobs are crashing on each run with this error:
Caused by: java.io.IOException: error=12, Not enough space
at java.lang.UNIXProcess.forkAndExec(Native Method)
I found documentation on Stack Overflow and on the Jenkins website regarding this error, which indicates a swap space problem (https://wiki.jenkins-ci.org/display/JENKINS/IOException+Not+enough+space).
My problem may or may not be different, but if I launch the process manually it works fine.
A weird thing is that I see different results from top and from prstat:
Specs:
Hudson processes are running in their own Unix user
OS: SunOS dc5c00-d12 5.10 Generic_147440-19 sun4v sparc sun4v
Memory:
from top:
32G phys mem, 6255M free mem, 16G total swap, 16G free swap
from prstat
NPROC USERNAME SWAP RSS MEMORY TIME CPU
50 user1 12G 12G 39% 89:02:31 0.3%
36 user2 11G 6779M 21% 155:17:41 0.0%
26 user3 10G 8509M 26% 4787:37:4 8.0%
6 hudson 572M 556M 1.7% 0:08:25 0.0%
57 root 280M 285M 0.9% 138:46:05 0.0%
Can anyone confirm whether I have a swap issue? top shows 16 GB free...
EDIT:
Results from swap -s (after the problem was temporarily resolved):
total: 19940168k bytes allocated + 12578048k reserved = 32518216k used, 4118208k available
It is certainly a swap issue.
top reports as free those swap blocks that do not contain paged-out data. However, even while unused, some of these blocks can be reserved (i.e. virtual memory that has been allocated but not yet touched). When there are no more blocks left to back memory reservations, you get this "Not enough space" exception.
swap -s shows your applications are reserving more than 12 GB while your swap area is just 16 GB. I would double the size of your swap to prevent virtual memory shortage in your case.
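The figures in the swap -s line from the question can be unpacked with a few lines of Python to make the reservation pressure visible. This is only a sketch of the arithmetic: the function name parse_swap_s and the dictionary keys are made up for illustration, and the "k" suffix is treated as KiB.

```python
import re

# Line copied from the question's `swap -s` output.
SWAP_S = ("total: 19940168k bytes allocated + 12578048k reserved = "
          "32518216k used, 4118208k available")


def parse_swap_s(line):
    """Extract the four kilobyte figures from a Solaris `swap -s` line."""
    allocated, reserved, used, available = (
        int(n) for n in re.findall(r"(\d+)k", line)
    )
    return {
        "allocated": allocated,
        "reserved": reserved,
        "used": used,
        "available": available,
    }


def kib_to_gib(kib):
    """Convert KiB to GiB."""
    return kib / 1024 ** 2


stats = parse_swap_s(SWAP_S)

if __name__ == "__main__":
    for key, value in stats.items():
        print(f"{key:>9}: {kib_to_gib(value):6.1f} GiB")
```

Allocated plus reserved adds up exactly to the "used" figure, and only about 3.9 GiB of virtual memory remains available even though top reports 16G of free swap; a fork that needs to reserve more than the remaining amount fails with error=12.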

Resources