When installing OpenStack via kolla-ansible you specify the OpenStack version in globals.yml, i.e. openstack_release: "victoria". This is as specific as you can get: there are no point-in-time tags, just a moving target like "victoria".
In my experience containers are updated randomly, not all-at-once, and frequently. Every time I rebuild I'm having to wait for docker to pull down things which have changed since my last deploy. This is problematic for multiple reasons, most acutely:
This is a fast-moving community-driven project. I'm having to work through new issues every few times I rebuild as a result of changes.
If I deploy onto one set of hosts, then deploy onto more hosts hours later, I'm waiting again on updates, and my stack is running containers of different versions.
These pulls take time and make my deployments vulnerable to timeouts and network problems.
To emphasize what a problem the second issue is, usually I can reset a failed deployment and try again, but not always. There have been times where I had residual issues, and due to my noobness it was quicker to dump fresh disks and start over. I'm using external ceph (the only ceph option in kolla-ansible:victoria), colocated with the compute nodes. Resetting pool / OSD state to an earlier point in time isn't in my toolbox yet, so I also wipe my OSD's and redo the ceph installation. I can pin version on ceph containers, but I start to sweat once the kolla-ansible installation starts. For a 4-hour total install, there's a not-small chance that another container will change in this time.
The obvious answer for anybody who does IT or software professionally is to pin my kolla:* container versions to a specific point-in-time tag, and not "victoria". I could pin each container to a digest, but that's not supported in the playbooks as written. I'd need to edit ansible playbooks and add a variable for every container that I want to pin. And then maintain that logic as new containers are added. I'm pulling 43 containers right now. This approach feels like "2 trailer park girls go 'round the outside".
A far simpler approach which I'm planning is to pull all the "victoria"-tagged containers, and then iterate through pushing them back into my own docker repo (e.g., "victoria-feralcoder-20210321"), and then update globals.yml to use this stable tag. I'm new to managing my own docker repos, so I don't know if I can retag images in a pull-through cache, or if I need to set up a private repo for that, so I may also have to switch kolla-ansible between docker.io and a private feralcoder repo, depending on whether I want to do a latest-pull or a pinned-pull. That would be a little "hey nineteen", cleaner and nicer, still not quite right...
I feel like this pull-retag-push-reconfigure-redeploy approach is hack jankery. Does anybody have a better suggestion? Like, to not check upstream for container changes if there's already a tag-match in the local mirror? Or maybe a way to pull-thru-and-retag, at the registry level?
Thanks, in advance, and also thanks to the kolla-ansible contributors for all their work, short of not providing version stability.
Here is one answer, for an existing deployment:
If you have already pulled containers to all your hosts, you can edit some ansible or python so that docker_container.pull=false for all containers.
This is the implementing module:
.../lib/python3.6/site-packages/ansible/modules/cloud/docker/docker_container.py.
This file might be in /usr/local/share/kolla-ansible/, or .../venvs/kolla-ansible/. When false, if the container exists on the host it won't be repulled.
This doesn't help the situation where a host hasn't yet pulled the package and you have a version already in your local mirror. In that situation, the stack host will pull the container, and your pull-through cache will pull down any container updates since last pull.
This is my current preferred solution, which is still, admittedly, a hack:
Pull the latest images as a batch, then tag them and push them to a local registry.
First, I need 2 docker registries: I can't push to a pull-through cache, so I also needed to set up a private registry, which I can push to.
I need to toggle settings in globals.yml back and forth during kolla-ansible deploy to achieve this:
When I run "kolla-ansible bootstrap-servers" I need the local registry configured, so that stack hosts are configured with appropriate insecure-registries configs.
I use "kolla-ansible pull" to prefetch the latest packages, when I want to update. For this I reconfigure globals.yml to point at kolla/*:victoria.
After I fetch the latest containers, I run a loop on one of my stack hosts to pull them from my pull-through cache, tag them to my local registry with a date stamp tag, and push them to my local registry.
Before I run the actual deploy I configure globals.yml to use my local registry and tags.
These are the globals.yml settings of interest:
## PINNED CONTAINER VERSIONS
#docker_registry: 192.168.127.220:4001
#docker_namespace: "feralcoder"
#openstack_release: "feralcoder-20210321"
# LATEST CONTAINER VERSIONS
docker_registry:
docker_registry_username: feralcoder
docker_namespace: "kolla"
openstack_release: "victoria"
My pseudocode is like this (intermediate steps pruned...):
# Point globals.yml at the private registry and the pinned tag.
use_localized_containers () {
    cp "$KOLLA_SETUP_DIR/files/kolla-globals-localpull.yml" /etc/kolla/globals.yml
    cat "$KOLLA_SETUP_DIR/files/kolla-globals-remainder.yml" >> /etc/kolla/globals.yml
}

# Point globals.yml at dockerhub, to get the latest "victoria" containers.
use_latest_dockerhub_containers () {
    cp "$KOLLA_SETUP_DIR/files/kolla-globals-dockerpull.yml" /etc/kolla/globals.yml
    cat "$KOLLA_SETUP_DIR/files/kolla-globals-remainder.yml" >> /etc/kolla/globals.yml
}

# Pull every cached image on $PULL_HOST, retag it with $TAG, and push it
# to the private registry.
localize_latest_containers () {
    for CONTAINER in $(ls "$KOLLA_PULL_THRU_CACHE"); do
        ssh_control_run_as_user root "docker image pull kolla/$CONTAINER:victoria" "$PULL_HOST"
        ssh_control_run_as_user root "docker image tag kolla/$CONTAINER:victoria $LOCAL_REGISTRY/feralcoder/$CONTAINER:$TAG" "$PULL_HOST"
        ssh_control_run_as_user root "docker image push $LOCAL_REGISTRY/feralcoder/$CONTAINER:$TAG" "$PULL_HOST"
    done
}

use_localized_containers
kolla-ansible -i "$INVENTORY" bootstrap-servers
use_latest_dockerhub_containers
kolla-ansible -i "$INVENTORY" pull
localize_latest_containers
use_localized_containers
kolla-ansible -i "$INVENTORY" deploy
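The pseudocode above never shows where $TAG comes from. A minimal sketch of one way to build a date-stamped tag and run the pull/tag/push sequence directly on the pull host (instead of through my ssh_control_run_as_user helper) could look like this. The function names make_pinned_tag and localize_image are made up for illustration and are not part of kolla-ansible; only the docker image pull/tag/push subcommands are real.

```shell
#!/usr/bin/env bash
# Sketch only: make_pinned_tag and localize_image are hypothetical names,
# not part of kolla-ansible.

# Build a date-stamped tag such as "feralcoder-20210321".
# The second argument is optional; without it, today's date is used.
make_pinned_tag () {
    local prefix="$1"
    local stamp="${2:-$(date +%Y%m%d)}"
    echo "${prefix}-${stamp}"
}

# Pull one upstream image, retag it for the private registry, and push it.
# Stops at the first docker command that fails.
localize_image () {
    local image="$1"
    local release="$2"
    local registry="$3"
    local namespace="$4"
    local tag="$5"
    local src="kolla/${image}:${release}"
    local dst="${registry}/${namespace}/${image}:${tag}"
    docker image pull "$src" &&
        docker image tag "$src" "$dst" &&
        docker image push "$dst"
}
```

Inside the loop this would be called as localize_image "$CONTAINER" victoria "$LOCAL_REGISTRY" feralcoder "$TAG", with TAG=$(make_pinned_tag feralcoder) set once before the loop so every image gets the same tag.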
We have most, but not all, of our build artifacts in a Git repo (Bitbucket).
Our current build looks like this and takes 30+ minutes to build/deploy to Firebase. We would like to reduce the build time.
We are not using Google Cloud Build at the moment, but before heading down that path, I want to find out if that would even be fruitful.
We have all of the code cloned from the git repo (Bitbucket), to a GCE VM.
And then 1 TB of static data is then copied into a directory under the git repo area, artifacts that are needed for the deploy.
We do not want to check in that 1TB of data into the git repo, it is from a 3rd party, it is rarely updated, and would be too heavy of a directory to pull into developer environments on their IDE's, it is pointless to do so.
We launch a build script on the GCE VM to build the code, and deploy to Firebase (bash script), it takes about 30 minutes.
We want the builds to go faster, and possibly to use Cloud Build.
With this:
a git repo
external files that need to remain in a stateful container, not copied over each time, due to the time it would take
how do we create a stateful container that would only require a git update (pull origin master), and then to fire off a build/deploy to Firebase?
We want to avoid ingress traffic to the Firebase deploy using external build services where the 1TB of data that remains the same each and every time is sent to Firebase, where we would be billed.
Cloud Run containers are not stateful. GCE VM's are stateful, but it requires that we keep them up and going 24x7x365, so that any developer anywhere can run a build, and that may take only 30 minutes out of any day, and we don't know when that will be, so leaving it up 24x7x365 is mostly wasteful.
We want to avoid building a stateless container where the code is checked out fresh each and every time, a git pull origin master will do, and to have to copy the 1TB of artifacts into the container each and every time taking time.
We just want to do:
git pull origin master
Fire off the build as the next step in the script
spin down the container, have it save its state for the next build, minimizing time, each and every time, saving the previous 'git pull origin master' updated artifacts, and preserving the 1TB files we copied to the container.
The ideal situation would be to have a container that is stateful, that spins down when not in use, and "spins up", or is made active for use when we need to do a build.
It would retain the previous git update (git pull origin master), and would retain all artifacts outside the git repo that we copy over. We also need shell access to the container (ssh, scp) etc.
A stateful 'Cloud Run' option would be ideal, but I don't know of such a thing (stateful containers with GCP that we can run and only be billed for runtime/compute time)
One solution is to use a VM for this. Add a startup script. In it
git pull origin master
Fire off the build as the next step in the script
Add this line, which stops your instance
gcloud compute instances stop $(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/id) --zone=$(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/zone)
By the way, each time that you start your VM, it will run the startup script and shut down automatically. You keep your persistent disk, and thus your 1TB of data, and you pay little because of the automatic stop.
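A minimal sketch of what such a startup script could look like. The metadata server returns the zone as a full path (projects/<number>/zones/<zone>), so this version trims it to the short zone name, and it uses the instance name instead of the numeric id. REPO_DIR, BUILD_CMD and the function names are placeholders I made up, not something from your setup.

```shell
#!/usr/bin/env bash
# Sketch of a self-stopping startup script. REPO_DIR and BUILD_CMD are
# placeholders; adjust them to the real checkout and build script.
REPO_DIR=${REPO_DIR:-/srv/repo}
BUILD_CMD=${BUILD_CMD:-./build_and_deploy.sh}
METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance"

# Read one value from the instance metadata server.
metadata () {
    curl -s -H "Metadata-Flavor: Google" "${METADATA_URL}/$1"
}

# The zone comes back as "projects/<number>/zones/<zone>"; keep the last part.
short_zone () {
    echo "${1##*/}"
}

# Update the checkout, run the build, then stop this VM whatever the result.
build_then_stop () {
    ( cd "$REPO_DIR" && git pull origin master && $BUILD_CMD )
    gcloud compute instances stop "$(metadata name)" \
        --zone="$(short_zone "$(metadata zone)")"
}
```

The startup script would end with a call to build_then_stop. For the stop to work, the VM's service account needs permission to stop the instance.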
If you have to wait for an external build, there are 2 solutions:
Either you set a sleep timer in the startup script and shut down after it
Or customize this tutorial -> At the end of your build, publish a message in PubSub, which triggers a function which will stop your instance.
EDITED to reply to comments
Here again, 2 solutions:
You can create a custom role with only the permissions required. You can see all compute roles here. If you provide access to the console, I recommend list (to view the VMs), start and stop. Otherwise, only start and/or stop if you write a script.
You can create a private function or Cloud Run service. Assign it a service account as its identity, with a role sufficient to start the VM (granting more permissions than required is not a good practice; prefer least privilege with a custom role), and grant the role cloudfunctions.invoker or run.invoker to the user (depending on whether you use Functions or Cloud Run) so that they can call this private endpoint and start the VM without any rights on the VM itself (only the right to perform an HTTP call).
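For the first option, a minimal sketch of creating and granting such a custom role with gcloud. The role id vmBuildStarter, the title and the function name are made up for illustration. The three permissions are the list/start/stop set mentioned above; console users may need a few more read permissions than this.

```shell
#!/usr/bin/env bash
# Sketch only: the role id "vmBuildStarter" and the function name are
# illustrative, not taken from the answer.

# Create a custom role limited to listing, starting and stopping VMs,
# then grant it to one user on the project.
grant_vm_start_stop () {
    local project="$1"
    local user_email="$2"
    local role_id=vmBuildStarter
    gcloud iam roles create "$role_id" \
        --project="$project" \
        --title="Start and stop build VM" \
        --permissions=compute.instances.list,compute.instances.start,compute.instances.stop &&
    gcloud projects add-iam-policy-binding "$project" \
        --member="user:${user_email}" \
        --role="projects/${project}/roles/${role_id}"
}
```

It would be called once per developer, for example grant_vm_start_stop my-project dev@example.com, where both arguments are placeholders.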
When running Corda nodes for testing or demo purposes, I often find a need to delete all the node's data and start it again.
I know I can do this by:
Shutting down the node process
Deleting the node's persistence.mv.db file and artemis folder
Starting the node again
However, I would like to know if it is possible to delete the node's data without restarting the node, as this would be much faster.
It is not currently possible to delete the node's data without restarting the node.
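Since a restart is needed anyway, the delete step from the question can at least be scripted. A minimal sketch, assuming each node lives in its own directory containing persistence.mv.db and the artemis folder; wipe_node_data is a hypothetical helper name, and stopping and starting the node process is left to whatever you already use for that.

```shell
#!/usr/bin/env bash
# Sketch only: wipe_node_data is a hypothetical helper. Stop the node
# process before calling it and start the node again afterwards.

# Remove the H2 database file and the Artemis folder of one node.
wipe_node_data () {
    local node_dir="$1"
    [ -d "$node_dir" ] || { echo "no such node directory: $node_dir" >&2; return 1; }
    rm -f  "$node_dir/persistence.mv.db"
    rm -rf "$node_dir/artemis"
}
```

Everything else in the node directory (configuration, certificates, CorDapps) is left untouched.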
If you are "resetting" the nodes for testing purposes, you should make sure that you are using the Corda testing APIs to allow your contracts and flows to be tested without actually starting a node. See the testing API docs here: https://docs.corda.net/api-testing.html.
One alternative to restarting the nodes would also be to put the demo environment in a VMware Workstation VM, take a snapshot of the VM while the nodes are still "clean", run the demo, and then reload the snapshot.
I'm trying to get log output (Console.WriteLine(..)) in my Docker logs, but to no avail.
I've tried:
Console.WriteLine(..)
Trace.WriteLine(..)
Flushing the console, flushing the trace.
I can see these outputs in a VS output window when I'm debugging, so they go somewhere.
I'm on windows Container, using microsoft/aspnet:4.7.1-windowsservercore-1709 and net4.7
These are the logs I get on container start
docker logs -f exportapi
ERROR ( message:Cannot find requested collection element. )
Applied configuration changes to section "system.applicationHost/applicationPools" for "MACHINE/WEBROOT/APPHOST" at configuration commit path "MACHINE/WEBROOT/APPHOST"
You have many good lateral options, like self-contained/server-contained executables (eg. Dotnet Core using microsoft/dotnet:runtime would proxy Console.WriteLine by default off the dotnet new web scaffold). Zero-configuration STDOUT logging has never been a common approach on IIS, but these modern options adopt it as best practice (logging should be a transparent backing service).
If you want or need a chain of three programs/assemblies to get your web service up (ServiceMonitor, W3SVC, and finally your assembly), then you need something like this: https://blog.sixeyed.com/relay-iis-log-entries-to-read-them-in-docker/
Overriding the entrypoint to tail more logs than the image does by default is unfortunately a common hack (not just in Microsoft land). So, in your case, I believe you need at least a trace listener config to emit Trace.WriteLine, and then the above approach to emit it: https://learn.microsoft.com/en-us/dotnet/framework/debug-trace-profile/how-to-create-and-initialize-trace-listeners
I am finding when working with larger datasets that the kernel may die, something I also experience on my local machine. Sometimes it comes back and sometimes not. Even the Tree panel won't react to terminate an errant kernel, e.g. "restart" does not work and the server itself seems to die, so the tree view won't respond or refresh. On my local machine I just kill the terminal instance and start over.
What is the "proper" way to restart everything?
FWIW the instance seems pegged at 150% cpu utilization atm
Related: is there any way to allow long running stuff to work?
I am trying to use a report generator (pandas-profiling) on a 2mm record dataset.. Works on my local..
found it here: https://cloud.google.com/datalab/getting-started
FWIW, these commands can be used in the new command-line shell on the Cloud console page, see https://cloud.google.com/shell/docs/, without the SDK on your machine. You need to modify the commands slightly since you will already be logged into your project.
Stopping/starting VM instances
You may want to stop a Cloud Datalab managed VM instance to avoid incurring ongoing charges. To stop a Cloud Datalab managed machine instance, go to a command prompt, and run:
$ gcloud auth login
$ gcloud config set project <YOUR PROJECT ID>
$ gcloud preview app versions stop main
After confirming that you want to continue, wait for the command to complete, and make sure that the output indicates that the version has stopped. If you used a non-default instance name when deploying, please use that name instead of "main" in the stop command, above (and in the start command, below).
For restarting a stopped instance, run:
$ gcloud auth login
$ gcloud config set project <YOUR PROJECT ID>
$ gcloud preview app versions start main
The server has only 64MB of memory. I'm trying to push a huge git repository to it. Initially the target directory contains an empty bare repository. The push fails:
$ git push server:/tmp/repo master
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
fatal: Out of memory, calloc failed
error: pack-objects died of signal 13
error: failed to push some refs to 'server:/tmp/repo'
$ ssh server cat /tmp/repo.git/config
[pack]
threads = 1
deltaCacheSize = 8m
windowMemory = 32m
[core]
repositoryformatversion = 0
filemode = true
bare = true
I get the same error message after changing git config pack.windowMemory 16m on the server.
The same push succeeds to localhost:
$ git push 127.0.0.1:/tmp/repo master
Password:
Counting objects: 3064514, done.
Compressing objects: 100% (470245/470245), done.
Writing objects: 100% (3064514/3064514), 703.02 MiB | 10.84 MiB/s, done.
Total 3064514 (delta 2569775), reused 3059081 (delta 2565342)
To 127.0.0.1:/tmp/repo
* [new branch] master -> master
Is there a remote git config setting which can make the push succeed? Or do I have to repack the repo locally before pushing (with what settings)?
Please note that using a different server with more memory is not an option. Adding memory to the existing server is an option, up to 96MB. It's OK for me to use more disk space than usual on the server if the memory limit is met.
Similar question without a working solution: https://serverfault.com/questions/372899/git-fails-to-push-with-error-out-of-memory
Repacking the repository locally didn't help, git push prints the same error. Repack settings in the local repo:
git config core.packedgitlimit 32m
git config core.packedgitwindowsize 32m
git config pack.threads 1
git config pack.deltacachesize 8m
git config pack.windowmemory 32m
git config pack.packsizelimit 500m
My idea is that the reason why it fails is that the total number of objects is too large: even the SHA-1 hashes won't fit (20 * 3064514 bytes is almost 64MB).
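To check that estimate with shell arithmetic (this counts only the raw 20-byte hashes, not whatever else git allocates per object):

```shell
# 20 bytes per SHA-1 hash, one hash per object reported by "Counting objects".
objects=3064514
bytes=$((20 * objects))
echo "$bytes bytes"                      # 61290280 bytes
echo "$((bytes / 1024 / 1024)) MiB"      # 58 MiB (rounded down)
```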
Possible other causes
As @torek pointed out in his comment, this may not be an indication of the server running out of memory, but an indication that something is going wrong locally. Perhaps something changed between when you were pushing to server and to local host that freed up memory on your local machine?
It's also plausible that git is figuring out that you're pushing to localhost, and bypassing the "Git aware" transport mechanism and/or using hardlinks, which might reduce the memory needed. I don't see any indication in the docs that it WOULD do this, and I'm not sure off the top of my head how you could test this, or force it not to do that, but it's a possibility.
Another possible issue is that the host.xz:path/to/repo.git/ url syntax is only recognized if there are no slashes before the first colon, so depending on what server is, that could be causing problems.
If none of these are the case, and the problem is in fact that it's running out of memory on the server, you might have a few options here, depending on the circumstances. I don't know if any of these will work, but they're worth a try.
Solution 1: don't push all the commits at once
I'm assuming you've got many commits in the commit history of master. Try pushing them in stages. E.g.
git push server:/tmp/repo master~500:refs/heads/master
git push server:/tmp/repo master~400:refs/heads/master
git push server:/tmp/repo master~300:refs/heads/master
git push server:/tmp/repo master~200:refs/heads/master
git push server:/tmp/repo master~100:refs/heads/master
git push server:/tmp/repo master
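The same staged push as a loop, as a rough sketch. push_in_stages is a name I made up. Each push names the destination explicitly (refs/heads/<branch>) because a source like master~500 is a commit expression, not a branch name, so git cannot guess which remote ref to update. The start value must not be larger than the number of first-parent ancestors the branch actually has.

```shell
#!/usr/bin/env bash
# Sketch only: push_in_stages is a hypothetical helper.

# Push BRANCH to REMOTE in steps, oldest commits first: BRANCH~START,
# then BRANCH~(START-STEP), and so on, finishing with BRANCH itself.
push_in_stages () {
    local remote="$1"
    local branch="$2"
    local start="$3"
    local step="$4"
    local n="$start"
    while [ "$n" -gt 0 ]; do
        git push "$remote" "${branch}~${n}:refs/heads/${branch}" || return 1
        n=$((n - step))
    done
    git push "$remote" "${branch}:refs/heads/${branch}"
}
```

For example, push_in_stages server:/tmp/repo master 500 100 runs the six pushes listed above. A smaller step means less data per push, at the cost of more round trips.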
Solution 2: Push individual objects one at a time
This is going to be incredibly tedious and DEFINITELY need to be automated/scripted on your local machine. However, you don't actually need to push whole commits all at once.
Instead, you can push individual objects one at a time as long as you push them to a tag ref instead of a branch ref. E.g. if we were working with https://github.com/llvm/llvm-project and wanted to push the tree object 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1 from it (this is the tree object pointed to by commit faf5e0ec737a676088649d7c13cb50f3f91a703a), we could do git push server:/tmp/repo 0082ee0b3ad78ff55b2a3a65ef5bfdb8cd9713a1:refs/tags/test. Using this we can push individual objects one at a time, starting with blobs, then the tree objects, then finally commit objects. We'd end up with a TON of tags to clean up later, but I'll leave that to you to figure out.
For the rest of these solutions, I'm working under a couple assumptions:
Given the limitations you described, and the way you specified the url as server:/tmp/repo instead of something ending with .git, I'm assuming this remote repository isn't going to be managed with any service like github or gitlab, which should give you a little more room to use some unconventional techniques.
I'm also assuming you probably have the ability to log on to/run commands on the server.
If either of these are not the case, and the above didn't work, I'm out of ideas at the moment.
Solution 3: backwards push using fetch or clone
There's actually nothing special about a server, it's just another git repository that you can trade commits with. The only difference is that a server is usually hosting what's called a bare repository: it doesn't typically keep a working tree of its own (in other words, it only keeps the contents of the .git folder).
So, try performing the push in reverse using fetch/clone from server:
Push to a third, intermediate server (let's call it server2). Ideally, one with a lot more performance, like a github hosted repo.
Log onto/ssh into server, and from there, clone the repo into /tmp/repo: git clone --bare git@github.com:path/to/your/repo.git.
I would be surprised if this solved anything on its own, but it's worth trying and step 1 will still set us up for solution 4 and 5. If by chance it does work, you can tidy up by removing server2 as a remote on server: git remote remove origin, then setting up your remotes on your local machine to point towards server instead of server2.
Solution 4: backwards push, but without fetching all the commits at once
Like solution 3, push to an intermediate server, but this time, instead of using clone and fetching everything all at once, fetch the commits in stages:
Log onto/ssh into server, and from there, initialize /tmp/repo as a bare repo:
cd /tmp/repo
git init --bare
git remote add origin git@github.com:path/to/your/repo.git
Still on server, fetch commits one at a time:
git fetch origin 569d84fe99e63e830ea036598f7fa7a5f9899d7c
git fetch origin 9aaba9d9bb4fc3648a9417820858086b14b6b73e
git fetch origin faf5e0ec737a676088649d7c13cb50f3f91a703a
Solution 5: backwards push, but using partial and/or shallow clones
Instead of fetching individual commits, we can use partial and/or shallow clones to restrict how much we are fetching at once. There is a good write-up explaining what those are on the github blog: https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/. This time, we won't use a bare repository. We want to be able to check out commits to fill in the missing objects later. You can follow the instructions here to convert it to a bare repository when you're done. Alternatively, instead of using a regular (non-bare) repository, explicitly fetching the objects might also work, but I don't know for sure off the top of my head.
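For the shallow-clone variant only, a rough and untested sketch of how the incremental fetching might go: start from a depth-1 clone on server and deepen it a fixed number of commits per round until git no longer reports the repository as shallow. deepen_until_complete is a made-up name. The sketch assumes a git new enough to have git fetch --deepen and git rev-parse --is-shallow-repository, and that deepening past the root commit makes the clone complete. The round limit is there in case that assumption doesn't hold.

```shell
#!/usr/bin/env bash
# Sketch only: deepen_until_complete is a hypothetical helper. Run it inside
# a repository created with something like: git clone --depth=1 <url>

# Fetch STEP more commits of history per round until the clone is no
# longer shallow, giving up after MAX_ROUNDS rounds.
deepen_until_complete () {
    local step="${1:-100}"
    local max_rounds="${2:-1000}"
    local round=0
    while [ "$(git rev-parse --is-shallow-repository)" = "true" ]; do
        [ "$round" -lt "$max_rounds" ] || return 1
        git fetch --deepen="$step" || return 1
        round=$((round + 1))
    done
}
```

If a round still runs out of memory, a smaller step (say deepen_until_complete 10) keeps each fetch smaller.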
I think everything I've already written, combined with that write-up should give you all the pieces you need to figure out how to do this. I've already spent hours writing this up, it's late, this solution's kind of complicated, and it's an esoteric question that hasn't been touched in years. If somebody comes across this and needs a more complete answer for this, leave a comment and I'll fill it in, but this is as far as I'm willing to go right now for some potential internet points if nobody actually needs this answer XD.