OpenStack Magnum: Docker private registry - openstack

Hi Team,
I am working with OpenStack Magnum, and in recent days I have been facing an issue deploying Magnum: because of the Docker Hub pull rate limit, Magnum is not pulling the Docker containers. Please help me point Magnum at a private registry so I can avoid this rate limit.
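One approach that is commonly suggested (a hedged sketch; check the docs for your Magnum release) is to mirror the images Magnum needs into your own registry and point the cluster template at it with the container_infra_prefix label. registry.example.local below is a placeholder for your registry, and the image name is only an example:

    # Mirror a required image into the private registry (repeat per image)
    docker pull k8s.gcr.io/pause:3.9
    docker tag k8s.gcr.io/pause:3.9 registry.example.local/pause:3.9
    docker push registry.example.local/pause:3.9

    # Create a cluster template that pulls from that registry;
    # note the trailing slash on container_infra_prefix
    openstack coe cluster template create k8s-private-registry \
        --image fedora-coreos-latest \
        --external-network public \
        --coe kubernetes \
        --labels container_infra_prefix=registry.example.local/

If the registry is not TLS-secured, the cluster template create command also accepts an --insecure-registry option; the exact list of images to mirror depends on your Magnum version and COE driver.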

Related

Artifactory Pro v7.30.x fails to start (multiple versions and installation methods)

I am evaluating a self-hosted Artifactory installation on a trial license. I followed the official installation instructions for the Docker container and the Linux archive file. Neither of these installation options works: the Artifactory service fails to start.
I have opened an issue to track the problem: https://www.jfrog.com/jira/browse/RTFACT-27182
TL;DR: a component fails, a nasty stack trace appears in the logs, and eventually the services stop.
It would seem that there is a bug in Artifactory. I have traced this back to multiple versions, and the issue spans multiple years.
The problem appears to be that Artifactory cannot get past the bootstrapping/initialization phase when started with artifactoryctl. At a certain point (around 2-5 minutes in), all the services stop and a stale pid file is left behind, which is bad.
The workaround I have found is that the service passes this initialization phase only after multiple start/stop cycles (three, to be exact). In other words, we call artifactoryctl start, wait for all the failures, then call artifactoryctl stop, and repeat two more times. On the fourth and final start, the service comes online (in about 150-190 s). From then on, the service starts correctly with a single call to artifactoryctl start; a scripted version is sketched below.
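For reference, the workaround as a shell loop (a sketch; the sleep duration and the pid-file path are assumptions you should adjust to your installation):

    #!/bin/sh
    # Run three start/wait/stop cycles to get past the failing init phase.
    for i in 1 2 3; do
        ./artifactoryctl start
        sleep 300                        # wait out the ~2-5 minute failure window
        ./artifactoryctl stop
        rm -f app/run/artifactory.pid    # assumed location of the leftover pid file
    done
    # Fourth start: the service should come online in roughly 150-190 s.
    ./artifactoryctl start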
I have not yet looked at the systemd unit file. My guess is that it has, or could be configured with, a number of retries to work around this issue, and perhaps the problem does not exist when using the service wrapper.
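If the unit lacks retries, a systemd drop-in along these lines would emulate the repeated-start workaround (a sketch; the unit name artifactory.service is an assumption):

    # /etc/systemd/system/artifactory.service.d/retry.conf
    [Unit]
    # Allow up to four start attempts in a 30-minute window
    StartLimitIntervalSec=1800
    StartLimitBurst=4

    [Service]
    Restart=on-failure
    RestartSec=30

    # then: systemctl daemon-reload && systemctl restart artifactory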
I have also not yet looked again at the Docker container, which appears to be failing for the same reason. A workaround off the top of my head would be to modify the entrypoint script. If you were to docker exec into the container and try the workaround above, it would likely terminate the root process and kill the container.

How did my Artifactory generic and docker repos suddenly change type/version?

We have been running Artifactory (currently version 6.9.0) in EC2 for months now with no problems. This was originally a licensed instance of Enterprise Artifactory that we let lapse (intentionally).
Last week we started getting a storage warning (we use cluster-s3 storage) that we were at 95% utilization (which disables uploads), so we started cleaning up old artifacts (binaries, Docker images, etc.) to get the storage down. We got it down for a while, but it crept back up -- high enough this time that we couldn't ssh in, so we rebooted the machine via the EC2 Console.
It came right back with no obvious problems. Then we deleted a generic repository that someone had set up as a backup of another system (300 GB), which brought us back plenty of space.
Today, a number of our builds started failing because the step to push the artifact to Artifactory failed. Upon further investigation, a number of our "generic" repositories are now appearing (and behaving) as "Docker" repositories. Further, a number of our v1 Docker repositories are now reporting as v2 Docker repos and blocking standard pushes from v1 clients.
The docs are pretty clear that we can't change the repo type, and I'm not seeing a way to migrate back to v1 from v2 Docker repos. I'm currently exporting one of the repos to see if we can import it as the right type.
Any idea what happened here? Did something get corrupted in the database? What can I even start to check?
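One place to start checking: Artifactory exposes each repository's stored configuration over its REST API, so you can dump what the server currently believes each repo's type is and diff that against your expectations (a sketch; the host and credentials are placeholders):

    # List all repositories with their current types
    curl -s -u admin:PASSWORD \
        "https://artifactory.example.com/artifactory/api/repositories"

    # Inspect one repository's full configuration; look at "packageType"
    # and, for Docker repos, "dockerApiVersion" (V1 vs V2)
    curl -s -u admin:PASSWORD \
        "https://artifactory.example.com/artifactory/api/repositories/my-generic-repo"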

Kolla-ansible too many open files

I am having an issue with a relatively small OpenStack cluster deployed with kolla-ansible. The issue is that after a few days the controllers stop working. When I go into the Docker container logs, I see "Too many open files" errors in all of them. I have tried raising the limits in limits.conf and the sysctl max-files settings for both processes and users. After all of that, the issue still shows up.
One interesting thing is that this was not happening until I had to reboot all of the controllers. I rebooted them because I needed to increase the amount of RAM they have, after they died swapping. My first thought was that kolla-ansible sets some configuration after running deploy, but I can't find anywhere in the repo where kolla-ansible changes ulimits or anything similar.
Any theories about what could cause this? Could it be related to increasing the RAM? Should I run reconfigure/deploy on each controller? I've looked through kolla-ansible's docs and forums and couldn't find anyone else having this issue.
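Not an answer, but a way to narrow it down: counting open descriptors per container usually shows which service is actually leaking (a sketch using standard docker commands):

    # Count open file descriptors for each container's main process
    for c in $(docker ps --format '{{.Names}}'); do
        pid=$(docker inspect -f '{{.State.Pid}}' "$c")
        echo "$c: $(sudo ls /proc/$pid/fd 2>/dev/null | wc -l) open fds"
    done

    # Also check what limit the processes are actually running with
    sudo grep 'open files' /proc/$pid/limits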
Update: this hasn't been fixed yet.
I submitted a bug report, https://bugs.launchpad.net/kolla-ansible/+bug/1901898
I don't know which versions of Kolla-Ansible and Linux you are using, but your problem seems closely related to this one:
On Ubuntu 16.04, please uninstall lxd and lxc packages. (An issue exists with cgroup mounts, mounts exponentially increasing when restarting container) (source: docs.openstack.org/kolla-ansible/4.0.0/quickstart.html)
I had this problem with the exponentially growing number of mount points after restarting my Docker containers too. My single-node test deployment became very slow because of this problem, but I can't remember at the moment whether I also had the same "too many open files" error.
You can delete the packages with apt-get remove lxc-common lxcfs lxd lxd-client (see the commands below). I did this fix together with a complete reinstallation of the kolla-ansible installation, so I don't know whether it also helps with an already existing installation. You should also use docker-ce instead of the docker packages from the apt repos.
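The commands, for reference (Ubuntu 16.04; the mount count is a quick way to confirm the cgroup-mount symptom):

    # Remove the conflicting LXC/LXD packages
    sudo apt-get remove lxc-common lxcfs lxd lxd-client

    # Sanity check: this count should stay flat across container
    # restarts instead of growing exponentially
    mount | grep -c cgroup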
This was fixed with a workaround in bug https://bugs.launchpad.net/keystonemiddleware/+bug/1883659. The problem was that the neutron server was keeping memcached connections open and not closing them, until the memcached container hit its open-files limit. There is a workaround mentioned in the bug link.

OpenStack: how to prevent losing VMs

I am using "devstack" to play with the openstack in my desktop.
I had configured several VMs in my instance. A couple of days ago there was a power failure that caused my desktop to power down (I didn't have a UPS attached to it). This resulted in my losing all the VMs, since I didn't unstack.
One solution to prevent this from happening next time is using a UPS. Are there any other solutions I can use to back up the VMs, so that even if there is a power loss the VMs will come back if I just restart and run ./stack.sh?
Create a snapshot of the VM.
Instance snapshots are uploaded to Glance, which stores them in /var/lib/glance/images on the controller node.
Back up this folder.
When data loss occurs, just restore this folder and launch a new instance by booting from the image: select the snapshot and click Launch.
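A minimal sketch of that flow (instance and image names, and the backup path, are placeholders):

    # Snapshot a running instance; Glance stores the resulting image
    openstack server image create --name my-vm-snap my-vm

    # Back up Glance's image store somewhere off the box
    sudo tar czf /backup/glance-images.tar.gz /var/lib/glance/images

    # After a crash: restore the folder, re-run ./stack.sh, then boot
    # a new instance from the snapshot
    sudo tar xzf /backup/glance-images.tar.gz -C /
    openstack server create --image my-vm-snap --flavor m1.small my-vm-restored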
Devstack is a developer environment; it is not meant to recover from power losses.
You should consider using another all-in-one OpenStack installer that supports restarting the OpenStack services without losing state. For instance, you can use Red Hat's Packstack - https://openstack.redhat.com/Quickstart

Zookeeper - upgrade from standalone to quorum

Currently I have a standalone ZK instance used in a test system.
But this test system has become a production system, and I would like to upgrade from 1 ZK instance to 3 without compromising the availability of the SolrCloud system that ZK is overseeing.
From what I've read, upgrading from 3 to 5 and so on is pretty easy using rolling restarts, but I haven't found any info on going from standalone (1 instance) to 3.
Does anyone have any insight on this (anyone who might have tried it)?
Thanks!
I managed to do it after all by stopping node 1, updating the configuration so every node knows about all the other servers, and then creating the myid files for each node.
Then I restarted node 1 and started nodes 2, 3, and 4, and everything went just fine.
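For anyone trying the same, the end state looks roughly like this (hostnames are placeholders; zoo.cfg should list the same server set on every node, and each node's myid must match its server.N entry):

    # zoo.cfg on every node
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    server.1=zk1.example.com:2888:3888
    server.2=zk2.example.com:2888:3888
    server.3=zk3.example.com:2888:3888
    server.4=zk4.example.com:2888:3888

    # On each node, write its own id into dataDir (shown here for node 1)
    echo 1 > /var/lib/zookeeper/myid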
