How can the data of a Corda node be deleted without restarting the node? - corda

When running Corda nodes for testing or demo purposes, I often find a need to delete all the node's data and start it again.
I know I can do this by:
Shutting down the node process
Deleting the node's persistence.mv.db file and artemis folder
Starting the node again
However, I would like to know if it is possible to delete the node's data without restarting the node, as this would be much faster.

It is not currently possible to delete the node's data without restarting the node.
If you are "resetting" the nodes for testing purposes, you should make sure that you are using the Corda testing APIs to allow your contracts and flows to be tested without actually starting a node. See the testing API docs here: https://docs.corda.net/api-testing.html.
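For illustration, here is a minimal sketch of an in-process test network using Corda's MockNetwork API (in Java, assuming Corda 4; the com.example.flows package name is a hypothetical placeholder). Tearing the mock network down and recreating it is effectively an instant "reset", with no persistence.mv.db file or artemis folder to clean up:
import net.corda.testing.node.MockNetwork;
import net.corda.testing.node.MockNetworkParameters;
import net.corda.testing.node.StartedMockNode;
import net.corda.testing.node.TestCordapp;
import java.util.Collections;

public class ResetFreeTest {
    public static void main(String[] args) {
        // Build an in-memory network; no real node processes or on-disk persistence are involved.
        MockNetwork network = new MockNetwork(new MockNetworkParameters()
                .withCordappsForAllNodes(Collections.singletonList(
                        TestCordapp.findCordapp("com.example.flows"))));   // hypothetical CorDapp package
        StartedMockNode nodeA = network.createNode();
        StartedMockNode nodeB = network.createNode();

        // ... start flows on nodeA / nodeB and call network.runNetwork() to deliver their messages ...

        // Stopping the mock network discards all state; recreating it gives you "clean" nodes again.
        network.stopNodes();
    }
}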
Another alternative to restarting the nodes is to run the demo environment in a VM (e.g. VMware Workstation), take a snapshot of the VM while the nodes are still "clean", run the demo, and then restore the snapshot.

Related

CloudFoundry App instances - EF Core database migration

I've written a .NET Core REST API which migrates/updates the database (using Entity Framework Core) in Startup.cs. Currently, only one instance is running in the production environment. It seems to be recommended to run 2 instances in production.
What happens while executing the cf push command? Are both instances stopped automatically or do I need to execute cf stop?
In addition, how do I prevent both instances from updating the database?
I've read about the CF_INSTANCE_INDEX environment variable. Is it OK to only start the database migration when CF_INSTANCE_INDEX is 0? Or does CloudFoundry provide the following mechanism: start the first instance, and once it is up and running, start the second instance?
What happens while executing the cf push command? Are both instances stopped automatically or do I need to execute cf stop?
Yes, your app will stop. The new code will stage (i.e. buildpacks run) and produce a droplet. Then the system will bring up all the requested instances using the new droplet.
In addition, how do I prevent both instances from updating the database? I've read about the CF_INSTANCE_INDEX environment variable. Is it OK to only start the database migration when CF_INSTANCE_INDEX is 0?
You can certainly do it that way. The instance number is guaranteed to be unique and the zeroth instance will always exist, so if you limit to the zeroth instance then it's guaranteed to only run once.
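For illustration, a minimal sketch of that guard (in Java rather than the question's .NET stack, since the pattern is language-agnostic; runDatabaseMigration() stands in for whatever migration call you use):
public class MigrationGuard {
    public static void main(String[] args) {
        // Cloud Foundry sets CF_INSTANCE_INDEX on every app instance; index "0" always exists.
        String index = System.getenv("CF_INSTANCE_INDEX");
        if ("0".equals(index)) {
            System.out.println("Instance 0: running database migration");
            // runDatabaseMigration();   // hypothetical helper, e.g. your EF Core / Flyway migration step
        } else {
            System.out.println("Instance " + index + ": skipping migration");
        }
    }
}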
Another option is to run your migration as a task (i.e. cf run-task). This runs in its own container, so it would only run once regardless of the number of instances you have. This SO post has some tips about running a migration as a task.
Or does CloudFoundry provide the following mechanism: start the first instance, and once it is up and running, start the second instance?
It does, it's the --strategy=rolling flag for cf push.
See https://docs.cloudfoundry.org/devguide/deploy-apps/rolling-deploy.html
I'm not sure that this feature would work for ensuring your migration runs only once. According to the docs (See "How it works" section at the link above), your new and old containers could overlap for a short period of time. If that's the case, running the migration could potentially break the old instances. It'll be a short period of time, just until they get replaced with new instances, but maybe something to consider.

How to pin openstack container versions when using kolla-ansible?

When installing openstack via kolla-ansible you specify the openstack version in globals.yml, e.g.: openstack_release: "victoria". This is as specific as you can get; there are no point-in-time tags, just a moving target like "victoria".
In my experience containers are updated randomly, not all-at-once, and frequently. Every time I rebuild I'm having to wait for docker to pull down things which have changed since my last deploy. This is problematic for multiple reasons, most acutely:
This is a fast-moving community-driven project. I'm having to work through new issues every few times I rebuild as a result of changes.
If I deploy onto one set of hosts, then deploy onto more hosts hours later, I'm waiting again on updates, and my stack is running containers of different versions.
These pulls take time and make my deployments vulnerable to timeouts and network problems.
To emphasize what a problem the second issue is, usually I can reset a failed deployment and try again, but not always. There have been times where I had residual issues, and due to my noobness it was quicker to dump fresh disks and start over. I'm using external ceph (the only ceph option in kolla-ansible:victoria), colocated with the compute nodes. Resetting pool / OSD state to an earlier point in time isn't in my toolbox yet, so I also wipe my OSD's and redo the ceph installation. I can pin version on ceph containers, but I start to sweat once the kolla-ansible installation starts. For a 4-hour total install, there's a not-small chance that another container will change in this time.
The obvious answer for anybody who does IT or software professionally is to pin my kolla:* container versions to a specific point-in-time tag, and not "victoria". I could pin each container to a digest, but that's not supported in the playbooks as written. I'd need to edit the ansible playbooks and add a variable for every container that I want to pin, and then maintain that logic as new containers are added. I'm pulling 43 containers right now. This approach feels like "2 trailer park girls go 'round the outside".
A far simpler approach which I'm planning is to pull all the "victoria"-tagged containers, and then iterate through pushing them back into my own docker repo (e.g., "victoria-feralcoder-20210321"), and then update globals.yml to use this stable tag. I'm new to managing my own docker repos, so I don't know if I can retag images in a pull-through cache, or if I need to set up a private repo for that, so I may also have to switch kolla-ansible between docker.io and a private feralcoder repo, depending on whether I want to do a latest-pull or a pinned-pull. That would be a little "hey nineteen", cleaner and nicer, still not quite right...
I feel like this pull-retag-push-reconfigure-redeploy approach is hack jankery. Does anybody have a better suggestion? Like, to not check upstream for container changes if there's already a tag-match in the local mirror? Or maybe a way to pull-thru-and-retag, at the registry level?
Thanks in advance, and thanks also to the kolla-ansible contributors for all their work, even though point-in-time version stability isn't provided.
Here is one answer, for an existing deployment:
If you have already pulled containers to all your hosts, you can edit some ansible or python so that docker_container.pull=false for all containers.
This is the implementing module:
.../lib/python3.6/site-packages/ansible/modules/cloud/docker/docker_container.py.
This file might be in /usr/local/share/kolla-ansible/, or .../venvs/kolla-ansible/. When pull is false, if the container image already exists on the host it won't be re-pulled.
This doesn't help the situation where a host hasn't yet pulled the image and you have a version already in your local mirror. In that situation, the stack host will pull the container, and your pull-through cache will pull down any container updates since the last pull.
This is my current preferred solution, which is still, admittedly, a hack:
Pull the latest images as a batch, then tag them and push them to a local registry.
First, I need 2 docker registries: I can't push to a pull-through cache, so I also need to set up a private registry, which I can push to.
I need to toggle settings in globals.yml back and forth during kolla-ansible deploy to achieve this:
When I run "kolla-ansible bootstrap-servers" I need the local registry configured, so that stack hosts are configured with appropriate insecure-registries configs.
I use "kolla-ansible pull" to prefetch the latest packages, when I want to update. For this I reconfigure globals.yml to point at kolla/*:victoria.
After I fetch the latest containers, I run a loop on one of my stack hosts to pull them from my pull-through cache, tag them to my local registry with a date stamp tag, and push them to my local registry.
Before I run the actual deploy I configure globals.yml to use my local registry and tags.
These are the globals.yml settings of interest:
## PINNED CONTAINER VERSIONS
#docker_registry: 192.168.127.220:4001
#docker_namespace: "feralcoder"
#openstack_release: "feralcoder-20210321"
# LATEST CONTAINER VERSIONS
docker_registry:
docker_registry_username: feralcoder
docker_namespace: "kolla"
openstack_release: "victoria"
My pseudocode is like this (intermediate steps pruned...):
use_localized_containers () {
  cp $KOLLA_SETUP_DIR/files/kolla-globals-localpull.yml /etc/kolla/globals.yml
  cat $KOLLA_SETUP_DIR/files/kolla-globals-remainder.yml >> /etc/kolla/globals.yml
}
use_latest_dockerhub_containers () {
  # We switch to dockerhub container fetches, to get the latest "victoria" containers
  cp $KOLLA_SETUP_DIR/files/kolla-globals-dockerpull.yml /etc/kolla/globals.yml
  cat $KOLLA_SETUP_DIR/files/kolla-globals-remainder.yml >> /etc/kolla/globals.yml
}
localize_latest_containers () {
  for CONTAINER in `ls $KOLLA_PULL_THRU_CACHE`; do
    ssh_control_run_as_user root "docker image pull kolla/$CONTAINER:victoria" $PULL_HOST
    ssh_control_run_as_user root "docker image tag kolla/$CONTAINER:victoria $LOCAL_REGISTRY/feralcoder/$CONTAINER:$TAG" $PULL_HOST
    ssh_control_run_as_user root "docker image push $LOCAL_REGISTRY/feralcoder/$CONTAINER:$TAG" $PULL_HOST
  done
}
use_localized_containers
kolla-ansible -i $INVENTORY bootstrap-servers
use_latest_dockerhub_containers
kolla-ansible -i $INVENTORY pull
localize_latest_containers
use_localized_containers
kolla-ansible -i $INVENTORY deploy

Google Cloud Build - how to do a build/deploy to Firebase with a stateful container that "spins down" to 0 (like cloud run)

We have most, but not all, of our build artifacts in a Git repo (Bitbucket).
Our current build looks like this, and takes 30+ minutes to build/deploy to Firebase; we would like to reduce the build time.
We are not using Google Cloud Build at the moment, but before heading down that path, I want to find out if that would even be fruitful.
We have all of the code cloned from the git repo (Bitbucket) to a GCE VM.
Then 1 TB of static data, artifacts needed for the deploy, is copied into a directory under the git repo area.
We do not want to check that 1 TB of data into the git repo: it is from a 3rd party, it is rarely updated, and it would be far too heavy a directory to pull into developer environments in their IDEs; it is pointless to do so.
We launch a build script (bash) on the GCE VM to build the code and deploy to Firebase; it takes about 30 minutes.
We want the builds to go faster, and possibly to use Cloud Build.
With this:
a git repo
external files that need to remain in a stateful container, not copied over each time, due to the time it would take
how do we create a stateful container that would only require a git update (pull origin master), and then to fire off a build/deploy to Firebase?
We want to avoid the ingress traffic of external build services sending the 1 TB of data, which remains the same every time, to Firebase on each deploy, since we would be billed for it.
Cloud Run containers are not stateful. GCE VMs are stateful, but they require that we keep them up and running 24x7x365 so that any developer anywhere can run a build, which may take only 30 minutes out of any day, and we don't know when that will be, so leaving them up 24x7x365 is mostly wasteful.
We want to avoid a stateless container where the code is checked out fresh each and every time (a git pull origin master will do) and where the 1 TB of artifacts has to be copied into the container on every build, which takes time.
We just want to do:
git pull origin master
Fire off the build as the next step in the script
Spin down the container and have it save its state for the next build, minimizing time on every build: the artifacts from the previous 'git pull origin master' and the 1 TB of files we copied to the container are preserved.
The ideal situation would be to have a container that is stateful, that spins down when not in use, and "spins up", or is made active for use when we need to do a build.
It would retain the previous git update (git pull origin master), and would retain all artifacts outside the git repo that we copy over. We also need shell access to the container (ssh, scp) etc.
A stateful 'Cloud Run' option would be ideal, but I don't know of such a thing (stateful containers on GCP that we can run and only be billed for runtime/compute time).
One solution is to use a VM for this. Add a startup script; in it:
git pull origin master
Fire off the build as the next step in the script
Add this line, which stops your compute instance:
gcloud compute instances stop $(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/id) --zone=$(curl -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/zone)
By the way, each time that you start your VM, it will run the startup script and shut down automatically. You keep your persistent disk, and thus your 1TB of data, and you pay little because of the automatic stop.
If you have to wait for an external build, there are 2 solutions:
Either you set a sleep timer in the startup script and shut down after it,
Or customize this tutorial -> at the end of your build, publish a message to Pub/Sub, which triggers a function that stops your instance.
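As an illustration of that last step, here is a rough sketch (not the tutorial's exact code) of a Pub/Sub-triggered Cloud Function on the Java 11 runtime that stops a named VM; the project, zone and instance names are hypothetical placeholders, and the function's service account only needs permission to stop that instance:
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.json.gson.GsonFactory;
import com.google.api.services.compute.Compute;
import com.google.auth.http.HttpCredentialsAdapter;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.functions.BackgroundFunction;
import com.google.cloud.functions.Context;

public class StopBuildVm implements BackgroundFunction<StopBuildVm.PubSubMessage> {
    // Minimal shape of the Pub/Sub event payload; we only care that a message arrived.
    public static class PubSubMessage { public String data; }

    @Override
    public void accept(PubSubMessage message, Context context) {
        try {
            Compute compute = new Compute.Builder(
                    GoogleNetHttpTransport.newTrustedTransport(),
                    GsonFactory.getDefaultInstance(),
                    new HttpCredentialsAdapter(GoogleCredentials.getApplicationDefault()))
                    .setApplicationName("stop-build-vm")
                    .build();
            // Hypothetical project/zone/instance; replace with your build VM's details.
            compute.instances().stop("my-project", "us-central1-a", "build-vm").execute();
        } catch (Exception e) {
            throw new RuntimeException("Failed to stop build VM", e);
        }
    }
}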
EDITED to reply to comments
Here again, 2 solutions:
You can create a custom role with only the permissions required. You can see all compute roles here. If you provide access to the console, I recommend list (to view the VMs), start and stop. Otherwise, only start and/or stop if you write a script.
You can create a private function or Cloud Run service. Assign a service account as its identity, with enough of a role to start the VM (even if that grants more permissions than required -> it's not good practice; prefer least privilege with a custom role), and grant the role function.invoker or run.invoker to the user (depending on whether you use Functions or Cloud Run) to allow them to call this private endpoint and start the VM without any rights on the VM itself (only the right to perform an HTTP call).

Gridgain node discovery and gridcache

I start the GridGain node using G.start(gridConfiguration), and the node automatically joins the existing nodes. After this I start loading the GridCache (which is configured to be LOCAL).
This works fine, but is there a way to access the GridCache without doing the G.start(gridConfiguration)? I would like to load the LOCAL cache first and then have the node be detected by other nodes once the cache is loaded successfully.
You need to have GridGain started in order to use its APIs. After the grid is started, you can access it using the GridGain.grid().cache(...) method.
What you can do, for example, is use a distributed count-down latch (GridCacheCountDownLatch), which works just like the java.util.concurrent.CountDownLatch class. You can have the other nodes wait on the latch while your local cache is loading. Once loading is done, you can call latch.countDown() and the other nodes will be able to proceed.
More information on the count-down latch, as well as other concurrent data structures in GridGain, can be found in the documentation.
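A rough sketch of that pattern (assuming the GridGain 6.x cache data-structures API and a shared, non-LOCAL cache named "sharedCache" backing the latch; exact factory signatures may differ between versions, so treat this as a sketch and check your version's javadoc):
import org.gridgain.grid.Grid;
import org.gridgain.grid.GridException;
import org.gridgain.grid.cache.datastructures.GridCacheCountDownLatch;

public class CacheLoadCoordinator {
    private static final String LATCH_NAME = "localCacheLoaded";   // hypothetical latch name

    // Run on the node that loads its LOCAL cache.
    public static void loadAndSignal(Grid grid) throws GridException {
        GridCacheCountDownLatch latch = grid.cache("sharedCache").dataStructures()
                .countDownLatch(LATCH_NAME, 1, false, true);       // name, count, autoDelete, create
        // ... load the LOCAL cache here ...
        latch.countDown();                                         // let the waiting nodes proceed
    }

    // Run on every other node before it starts using the data.
    public static void awaitLoaded(Grid grid) throws GridException {
        GridCacheCountDownLatch latch = grid.cache("sharedCache").dataStructures()
                .countDownLatch(LATCH_NAME, 1, false, true);
        latch.await();
    }
}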

How to lock node for deletion process

Within Alfresco, I want to delete a node, but I don't want it to be used by any other users in a cluster environment.
I know that I can use LockService to lock a node (in a cluster environment), as in the following lines:
lockService.lock(deleteNode);
nodeService.deleteNode(deleteNode);
lockService.unlock(deleteNode);
The last line may cause an exception because the node has already been deleted, and indeed the exception it causes is:
A system error happened during the operation: Node does not exist: workspace://SpacesStore/cb6473ed-1f0c-4fa3-bfdf-8f0bc86f3a12
So how do I ensure concurrency in a cluster environment when deleting a node, to prevent two users from accessing the same node at the same time when one of them wants to update it and the other wants to delete it?
Depending on your cluster environment (e.g. the same DB server used by all Alfresco instances), transactions will most likely be enough to ensure no stale content is used:
serverA(readNode)
serverB(deleteNode)
serverA(updateNode) <--- transaction failure
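As a minimal sketch of that transactional approach (assuming the standard transactionService and nodeService beans are injected; the exists() check inside the transaction is what turns a concurrent delete into a no-op rather than an exception):
import org.alfresco.repo.transaction.RetryingTransactionHelper;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.transaction.TransactionService;

public class SafeDelete {
    private TransactionService transactionService;   // injected Alfresco beans
    private NodeService nodeService;

    public void deleteIfPresent(NodeRef deleteNode) {
        RetryingTransactionHelper txnHelper = transactionService.getRetryingTransactionHelper();
        txnHelper.doInTransaction(() -> {
            // Re-check inside the transaction: a concurrent delete makes this a no-op instead of an error.
            if (nodeService.exists(deleteNode)) {
                nodeService.deleteNode(deleteNode);
            }
            return null;
        }, false, true);   // readOnly = false, requiresNew = true
    }
}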
The JobLockService allows more control in case of more complex operations, which might involve multiple, dynamic nodes (or no nodes at all, e.g. sending emails or similar):
serverA(acquireLock)
serverB(acquireLock) <--- wait for the lock to be released
serverA(readNode1)
serverA(if something then updateNode2)
serverA(updateNode1)
serverA(releaseLock)
serverB(readNode2)
serverB(releaseLock)
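A rough sketch of that JobLockService pattern (the lock QName and time-to-live below are arbitrary placeholders):
import org.alfresco.repo.lock.JobLockService;
import org.alfresco.service.namespace.NamespaceService;
import org.alfresco.service.namespace.QName;

public class ClusterSafeOperation {
    // Hypothetical lock name; any QName unique to this operation will do.
    private static final QName LOCK_QNAME =
            QName.createQName(NamespaceService.SYSTEM_MODEL_1_0_URI, "deleteNodeJob");
    private static final long LOCK_TTL_MS = 30000L;   // arbitrary time-to-live

    private JobLockService jobLockService;            // injected Alfresco bean

    public void runExclusively(Runnable work) {
        // getLock() throws LockAcquisitionException if another cluster member holds the lock.
        String lockToken = jobLockService.getLock(LOCK_QNAME, LOCK_TTL_MS);
        try {
            work.run();                               // e.g. read/update/delete the nodes involved
        } finally {
            jobLockService.releaseLock(lockToken, LOCK_QNAME);
        }
    }
}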
